DOI: 10.3390/bioengineering13070765 ISSN: 2306-5354

Benign Share Benefit to Malignant: Balanced Mixing on Feature Space for Imbalanced Breast Cancer Classification

Farchan Hakim Raswa, Muhammad Fadlurrohman, Bach-Tung Pham, Ika Candradewi, Afiahayati, Ming-Hsiang Su, Chung-I Huang, Kuo-Chen Li, Shih-Lun Chen, Yung-Hui Li, Jia-Ching Wang

Training a deep learning model on an imbalanced mammography dataset can bias it toward common benign BI-RADS categories and reduce recognition of less frequent malignant or high-risk categories. To address this issue, we propose B2M (Benign Share Benefit to Malignant), a model-agnostic framework for imbalance-aware multi-class BI-RADS classification in C-View mammography. C-View is a synthesized 2D mammographic image generated from 3D digital breast tomosynthesis (DBT) data, capturing DBT-derived structural information while requiring less memory and computation than processing the full 3D image volume. B2M uses a two-phase training strategy that combines dual sampling with feature-space mixing. In Phase I, the model is trained with dual sampling, integrating instance-based and class-balanced sampling to increase minority-class representation while preserving majority-class diversity. In Phase II, the model is fine-tuned with feature-space mixing using samples from the two sampling streams. A soft-target regularization objective supervises the mixed features using labels from both streams, encouraging smoother decision boundaries across BI-RADS categories. We evaluated B2M on an imbalanced mammography cohort from the C-View EMBED dataset using stratified 5-fold cross-validation across multiple CNN backbones. In these experiments, ResNeXt-50 with B2M achieved the highest balanced accuracy and Macro-F1 scores, outperforming the evaluated oversampling and mixing-based methods. This improvement comes with an offline training-time overhead of approximately 2.81×, but it does not increase inference cost. Overall, the results suggest that B2M may be useful for imbalanced multi-class BI-RADS classification in C-View mammography. However, the findings are based on the EMBED cohort, and further validation, including external and prospective evaluation, is needed before clinical use.
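The two ingredients described above, dual sampling and feature-space mixing with soft targets, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the mixing coefficient is drawn from a Beta distribution as in mixup-style methods (the abstract does not state how it is chosen), and the feature vectors stand in for whatever intermediate representation the backbone produces.

```python
import random
from collections import Counter


def instance_sampling_weights(labels):
    """Instance-based stream: every sample is equally likely to be drawn."""
    n = len(labels)
    return [1.0 / n] * n


def class_balanced_sampling_weights(labels):
    """Class-balanced stream: every class has the same total probability."""
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[y]) for y in labels]


def one_hot(label, num_classes):
    """Encode an integer label in 0..num_classes-1 as a one-hot list."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def mix_features(feat_a, label_a, feat_b, label_b, lam, num_classes):
    """Convex combination of two feature vectors and of their one-hot labels."""
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(feat_a, feat_b)]
    y_a = one_hot(label_a, num_classes)
    y_b = one_hot(label_b, num_classes)
    soft_target = [lam * p + (1.0 - lam) * q for p, q in zip(y_a, y_b)]
    return mixed, soft_target


def draw_mixed_pair(features, labels, num_classes, alpha=1.0, rng=random):
    """Draw one sample per stream and mix them (alpha is an assumed setting)."""
    indices = range(len(labels))
    i = rng.choices(indices, weights=instance_sampling_weights(labels), k=1)[0]
    j = rng.choices(indices, weights=class_balanced_sampling_weights(labels), k=1)[0]
    lam = rng.betavariate(alpha, alpha)  # assumption: mixup-style coefficient
    return mix_features(features[i], labels[i], features[j], labels[j],
                        lam, num_classes)
```

With labels `[0, 0, 0, 1]`, the class-balanced stream gives the single minority sample a probability of 0.5, while the instance-based stream keeps it at 0.25, so the mixed pair sees minority features more often without discarding majority samples.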
