An Unsupervised Deep Learning Framework for Quantitative Breast Density Estimation from Mammograms
Khaldoon Alhusari, Salam Dhou

Breast cancer is the most commonly diagnosed cancer in women, with early detection playing a critical role in clinical outcomes. Mammography remains the standard screening modality, producing X-ray images used to assess mammographic density, a key indicator of the proportion of fibroglandular tissue within the breast. The Breast Imaging-Reporting and Data System (BI-RADS) is widely used to report density in four qualitative categories. High density can obscure malignancies and is independently associated with elevated breast cancer risk. Manual interpretation of mammographic density is prone to subjectivity and inter-observer variability, and supervised estimation methods trained on such labels may inherit this subjectivity. This work proposes an unsupervised framework for quantitative breast density estimation that requires no labeled data in its core pipeline. Expert labels are used exclusively to calibrate post hoc discretization thresholds for binary classification, enabling comparison with supervised methods in the literature. The main contributions are: (i) an adaptive Region of Interest (ROI) extraction algorithm, (ii) a Convolutional Neural Network (CNN) based unsupervised segmentation pipeline tuned for mammographic density separation, (iii) a novel confidence metric for identifying unreliable segmentation outputs, (iv) a label correction mechanism for low-confidence cases, and (v) a confidence-filtered majority voting scheme for per-patient classification. The framework is evaluated on two public datasets, DDSM and INbreast, on which segmentation achieves Silhouette scores exceeding 0.92. Agreement with expert labels reaches 71.43% and 79.28% for DDSM and INbreast, respectively. Image-level clustering quality assessment confirms effective unsupervised labeling, with Silhouette scores averaging 0.57 for DDSM and 0.50 for INbreast.
The proposed framework provides a practical and non-subjective model for quantitative breast density estimation, with potential utility as a decision-support tool for radiologists that can be considered in clinical practice after further investigation.
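
To make contribution (v) more concrete, the following is a minimal Python sketch of what a confidence-filtered majority vote over the images of one patient could look like. It is an illustration under stated assumptions, not the framework's actual implementation: the function name `patient_density_label`, the label strings "dense" and "non-dense", the threshold `min_confidence`, the fallback to all images when none pass the filter, and the tie-break by summed confidence are all hypothetical choices that the abstract does not specify.

```python
from collections import Counter, defaultdict
from typing import Optional, Sequence


def patient_density_label(
    labels: Sequence[str],
    confidences: Sequence[float],
    min_confidence: float = 0.5,  # hypothetical threshold, not from the paper
) -> Optional[str]:
    """Combine per-image density labels into one per-patient label.

    Images whose confidence is below `min_confidence` are excluded from
    the vote. If no image passes the filter, all images are used instead
    (an assumed fallback). Ties are broken by summed confidence (also an
    assumption). Returns None when there are no images at all.
    """
    if len(labels) != len(confidences):
        raise ValueError("labels and confidences must have the same length")

    pairs = [(l, c) for l, c in zip(labels, confidences) if c >= min_confidence]
    if not pairs:
        # Assumed fallback: vote over every image rather than return nothing.
        pairs = list(zip(labels, confidences))
    if not pairs:
        return None

    votes = Counter(label for label, _ in pairs)
    weight = defaultdict(float)
    for label, conf in pairs:
        weight[label] += conf

    # Most votes wins; summed confidence breaks ties.
    return max(votes, key=lambda label: (votes[label], weight[label]))


if __name__ == "__main__":
    image_labels = ["dense", "dense", "dense", "non-dense", "non-dense"]
    image_conf = [0.1, 0.2, 0.3, 0.9, 0.8]
    print(patient_density_label(image_labels, image_conf))
```

In this example the three "dense" images fall below the threshold, so only the two high-confidence "non-dense" images vote. A plain majority vote over all five images would return "dense" instead, which is the kind of outcome that filtering on confidence is meant to guard against.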