Addressing Clinical Ambiguity in Breast Density Assessment: A Hybrid Multi-View Deep Learning Framework for BI-RADS B vs. C Classification
Bochra Triqui, Hicham Kaid-Slimane
Background/Objectives: Mammographic breast density assessment is a crucial step in breast cancer detection, risk stratification, and the evaluation of lesion visibility. However, distinguishing between intermediate density categories, especially BI-RADS B and C, is difficult because of the high inter-observer variability among radiologists. This difficulty motivates the development of robust and interpretable automated techniques. This work presents a hybrid multi-view deep learning framework based on EfficientNet-B4 and U-Net for BI-RADS B vs. C classification.
Methods: The proposed model fuses craniocaudal (CC) and mediolateral oblique (MLO) views at the feature level to capture complementary anatomical and fibroglandular tissue characteristics. Interpretability is provided by Grad-CAM, which highlights the regions relevant to decision-making. The approach is evaluated on the RSNA mammography dataset, which consists of 23,513 images of 4796 patients, using a patient-wise split to prevent data leakage and to ensure a clinically realistic assessment. This protocol offers a more reliable evaluation than the image-by-image assessment strategies usually employed in previous studies.
Results: The model achieves an accuracy of 87% and an area under the curve (AUC) of 94.40%. These performance levels are consistent with those reported in recent research, suggesting competitive performance in this difficult classification task. The improvements over the assessed reference values are statistically significant, as confirmed by McNemar's and DeLong's tests. Furthermore, a qualitative evaluation carried out by experienced, certified radiologists corroborates the clinical relevance of the highlighted regions.
Conclusions: The proposed framework, which combines multi-view deep learning with explainable AI, produces results consistent with observed inter-observer variability and may support more consistent breast density assessment and clinical decision-making. However, further prospective multicenter validation is necessary before any clinical application.
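
The patient-wise split mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `patient_wise_split`, the 20% test fraction, the fixed seed, and the toy records are all assumptions made for illustration. The only property taken from the text is that every image of a given patient lands in the same partition, so CC and MLO views of one patient cannot leak across training and test sets.

```python
import random
from collections import defaultdict


def patient_wise_split(records, test_fraction=0.2, seed=0):
    """Split (patient_id, image_id) records so that no patient
    appears in both the training and the test partition.

    test_fraction and seed are illustrative assumptions, not values
    taken from the paper.
    """
    # Group image identifiers by patient.
    by_patient = defaultdict(list)
    for patient_id, image_id in records:
        by_patient[patient_id].append(image_id)

    # Shuffle patients (not images) reproducibly.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Reserve a fraction of the patients for testing.
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [(p, i) for p, i in records if p not in test_patients]
    test = [(p, i) for p, i in records if p in test_patients]
    return train, test


# Toy data: 10 hypothetical patients, each with one CC and one MLO image.
records = [(f"P{p}", f"P{p}_{view}") for p in range(10) for view in ("CC", "MLO")]
train, test = patient_wise_split(records)

train_patients = {p for p, _ in train}
test_patients = {p for p, _ in test}
print(len(train), len(test), train_patients & test_patients)
```

An image-by-image split would instead shuffle `records` directly, which can place the CC view of a patient in training and the MLO view of the same patient in testing. Grouping by patient before shuffling rules that out.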