Architecture‐Aware Block‐Level Pruning Across CNN Models for Breast Cancer Histopathology Image Classification
Samira Pouramin, Mahdi Nangir, Hadi Seyedarabi
ABSTRACT
Convolutional neural networks (CNNs) have achieved strong performance in medical image analysis; however, their deployment in edge computing environments remains challenging. This paper introduces an architecture‐aware block‐level pruning framework that progressively removes implementation‐level blocks from sequential (VGG16, VGG19) and branched (ResNet50, InceptionV3) CNN architectures. Each pruning step is followed by training under one of three paradigms (training from scratch, feature extraction, or fine‐tuning), and performance is evaluated through the accuracy–efficiency trade‐off and complementary metrics. To exploit the diverse, architecture‐dependent representations learned by the best‐performing models in each regime, a stacking ensemble strategy is employed. Experiments on the PatchCamelyon dataset reveal distinctly architecture‐ and regime‐dependent pruning behavior: sequential models exhibit monotonic accuracy–efficiency degradation when trained from scratch or fine‐tuned but non‐monotonic patterns under feature extraction, whereas branched architectures display non‐monotonic behavior across all regimes owing to their multi‐path or residual block redundancy. Under moderate pruning, the pruned InceptionV3 model outperforms the prior state of the art (SOTA) with a 5.47% accuracy improvement; under strong pruning, the pruned InceptionV3 and ResNet50 models exceed results reported in the literature with 4.37% and 5.38% accuracy improvements, respectively. Under aggressive pruning, two different pruned configurations of InceptionV3 outperform SOTA with 1.21% and 10% accuracy improvements. The stacking ensembles further yield 1.78%–6.61% accuracy gains, 2%–5% AUC improvements, and 4%–6% increases in F1‐score over individual CNN backbones.
These findings indicate that the gains obtained through ensemble learning reflect the capability of the pruned networks, which retain strong predictive performance despite their compressed architectures.
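To make the idea of progressive block-level pruning concrete for a sequential backbone, the following is a minimal sketch rather than the authors' implementation. It assumes that blocks are removed from the deepest end first (the abstract does not state the removal order) and counts only convolutional parameters using the standard VGG16 layer widths. The names `VGG16_BLOCKS`, `conv_params`, `block_params`, and `pruning_schedule` are hypothetical and introduced purely for illustration.

```python
# Standard VGG16 convolutional blocks as (in_channels, out_channels) per layer.
# All kernels are 3x3; pooling layers carry no parameters and are omitted.
VGG16_BLOCKS = {
    "block1": [(3, 64), (64, 64)],
    "block2": [(64, 128), (128, 128)],
    "block3": [(128, 256), (256, 256), (256, 256)],
    "block4": [(256, 512), (512, 512), (512, 512)],
    "block5": [(512, 512), (512, 512), (512, 512)],
}


def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolutional layer."""
    return k * k * c_in * c_out + c_out


def block_params(layers):
    """Total parameters of all convolutional layers in one block."""
    return sum(conv_params(c_in, c_out) for c_in, c_out in layers)


def pruning_schedule(blocks):
    """Yield (kept block names, parameter count) after dropping 0, 1, 2, ...
    trailing blocks. Assumption: pruning proceeds from the deepest block
    backward, and at least one block is always kept."""
    names = list(blocks)
    for n_removed in range(len(names)):
        kept = names[: len(names) - n_removed]
        yield kept, sum(block_params(blocks[name]) for name in kept)


schedule = list(pruning_schedule(VGG16_BLOCKS))
for kept, n_params in schedule:
    print(f"{len(kept)} block(s) kept: {n_params:,} conv parameters")
```

In a full pipeline, each entry of such a schedule would correspond to one pruned model that is then trained from scratch, used as a frozen feature extractor, or fine-tuned, with accuracy recorded against parameter count. Branched architectures such as ResNet50 or InceptionV3 would need a graph-aware variant, since their blocks cannot be described as a simple ordered list of layers.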