GDNet: A Robust 2.5D Multimodal MRI Brain Tumor Segmentation Framework with EMA Stabilization and Tumor-Aware Sampling
Behnam Kiani Kalejahi, Sajid Khan, Mohammad Javad Rajabi
Accurate, automated delineation of adult diffuse gliomas from multi-parametric magnetic resonance imaging (mpMRI) is central to quantitative neuro-oncology. Volumetric 3D networks dominate the BraTS leaderboard but require expensive GPUs and long training cycles, and they provide diminishing returns relative to their compute budget. Slice-wise 2D models, by contrast, discard inter-slice context that is informative for thin tumor rims and small enhancing foci. We introduce GDNet, a 2.5D multimodal MRI segmentation framework for adult glioma evaluated on the BraTS 2024 cohort. GDNet consumes a stack of three adjacent axial slices from the four standard BraTS modalities (T1, T1ce, T2, FLAIR) as a 12-channel input to a compact U-shaped encoder–decoder with Group Normalization and predicts whole tumor (WT), tumor core (TC), and enhancing tumor (ET) masks for the central slice. The training pipeline pairs the 2.5D backbone with: (i) Exponential Moving Average (EMA) of model weights with decay 0.999, (ii) mixed tumor-aware slice sampling (p_tumor = 0.50), (iii) a compound Cross-Entropy + Soft-Dice loss, and (iv) AdamW with warm-up plus cosine annealing under Automatic Mixed Precision. We performed a systematic, step-by-step ablation covering a 2D baseline, EMA + mixed sampling, tumor-centered crop fine-tuning, a GDNet-inspired architectural integration, a region-aware loss, 3-slice and 5-slice 2.5D inputs, and connected-component post-processing, and we report multi-seed results to quantify reproducibility. On the held-out BraTS 2024 test partition, the final 3-slice 2.5D GDNet achieved positive-only Dice scores of 0.791 ± 0.000 (WT), 0.736 ± 0.003 (TC), 0.654 ± 0.004 (ET), and a mean foreground positive-only Dice of 0.820 ± 0.000 across seeds; the all-slice mean foreground Dice exceeded 0.927 ± 0.000. Validation positive-only scores were 0.805 ± 0.002 (WT), 0.757 ± 0.004 (TC), 0.683 ± 0.009 (ET).
The inter-seed standard deviation was small for every region (≤0.01 Dice points) across the two seeds evaluated; with only two seeds, we regard this as preliminary evidence of training stability rather than a strong reproducibility claim. The ablation isolated EMA + mixed tumor sampling and the 2.5D context window as the dominant sources of improvement; notably, a GDNet-style architectural integration with a region-aware loss did not outperform the simpler 2.5D U-Net on positive-only WT/TC/ET, and light post-processing improved only all-slice Dice. A failure-mode audit found that the residual catastrophic predictions are concentrated on a small minority of diffuse, infiltrative tumors with mass effect. In conclusion, carefully engineered training strategies (tumor-aware sampling and EMA stabilization), combined with a modest 2.5D context window, recover a substantial fraction of the accuracy of much heavier 3D networks at a fraction of the compute, show low variance across the two seeds evaluated, and outperform a heavier GDNet-inspired architectural variant on the same data. GDNet is therefore a practical and, pending external validation, potentially clinically deployable framework for multimodal glioma segmentation on workstation-class GPU hardware.
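To make three of the components named above concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows how three adjacent slices from four modalities could be stacked into a 12-channel 2.5D input, how mixed tumor-aware slice sampling with p_tumor = 0.50 might be drawn, and how an EMA weight update with decay 0.999 is computed. The function names, the (modality, depth, height, width) array layout, the border clamping at the first and last slice, and the reading of "mixed" sampling as "tumor-bearing slice with probability p_tumor, otherwise uniform over all slices" are all assumptions made for illustration.

```python
import numpy as np


def make_25d_input(volume, z, context=1):
    """Stack slices z-context..z+context of every modality into channels.

    volume: array of shape (4, D, H, W), one entry per modality (assumed layout).
    Returns an array of shape (4 * (2 * context + 1), H, W); 12 channels for context=1.
    """
    depth = volume.shape[1]
    # Assumed border rule: clamp neighbour indices so edge slices repeat the border.
    idx = np.clip(np.arange(z - context, z + context + 1), 0, depth - 1)
    stack = volume[:, idx]                      # (4, 2*context+1, H, W)
    return stack.reshape(-1, *volume.shape[2:])  # modality-major channel order


def sample_slice(labels, rng, p_tumor=0.5):
    """Pick an axial slice index with mixed tumor-aware sampling (assumed rule).

    labels: integer mask of shape (D, H, W), nonzero where tumor is present.
    With probability p_tumor a tumor-bearing slice is drawn; otherwise the
    slice is drawn uniformly from the whole volume.
    """
    has_tumor = labels.reshape(labels.shape[0], -1).any(axis=1)
    tumor_slices = np.flatnonzero(has_tumor)
    if tumor_slices.size > 0 and rng.random() < p_tumor:
        return int(rng.choice(tumor_slices))
    return int(rng.integers(labels.shape[0]))


def ema_update(ema_weights, weights, decay=0.999):
    """Update a dict of EMA weights in place: ema = decay * ema + (1 - decay) * w."""
    for name, w in weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * w
    return ema_weights
```

In a training loop of this shape, `sample_slice` would choose the central slice, `make_25d_input` would build the network input for it, and `ema_update` would be called after each optimizer step so that the averaged weights, rather than the raw ones, are used for evaluation.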