MDC-MobileNetV3: A Lightweight Multi-Scale Hierarchical Attention Network for Remote Sensing Scene Classification
Haonan Liu, Xiao Wang, Jialong Sun, Xingchi Yang, Zhilong Wang
Remote sensing scene classification remains challenging because of substantial object-scale variation, complex background interference, and high inter-class similarity. To address these issues, a lightweight classification framework, termed MDC-MobileNetV3, is proposed based on the MobileNetV3-Large backbone. The framework integrates three components: a Multi-Scale Feature Extraction (MSFE) module that captures spatial information at different receptive fields, a Dynamic Feature Weighted Fusion (DFWF) mechanism for adaptive feature recalibration, and a hierarchical CBAM (Convolutional Block Attention Module) attention strategy that enhances the representation of discriminative regions. The model achieves classification accuracies of 99.52%, 91.54%, 96.48%, 97.35%, 92.43%, and 99.72% on the UC Merced, WHU-RS19, NWPU-Resisc45, AID, CLRS, and PatternNet benchmark datasets, respectively, while remaining lightweight at approximately 4.35 M parameters. In addition, Grad-CAM visualizations indicate that the model focuses on semantically meaningful regions and suppresses irrelevant background information. These results indicate that the proposed framework offers a favorable trade-off among classification accuracy, model compactness, and interpretability for remote sensing scene understanding.
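
The abstract does not specify how the DFWF mechanism computes its weights. As a rough illustration of the general idea of adaptively weighting multi-scale branches, the following minimal sketch fuses per-branch feature vectors with softmax-normalized weights. The function names, the use of softmax, and the toy inputs are assumptions made for illustration only and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(branches, scores):
    """Fuse equally sized feature vectors with softmax-normalized weights.

    branches: one feature vector per receptive-field branch (toy stand-ins
              for the outputs of a multi-scale module).
    scores:   one raw score per branch. Hypothetical: in a trained network
              these would be learned or predicted from the input.
    """
    if len(branches) != len(scores):
        raise ValueError("need exactly one score per branch")
    dim = len(branches[0])
    if any(len(feat) != dim for feat in branches):
        raise ValueError("all branches must have the same length")

    weights = softmax(scores)
    fused = [0.0] * dim
    for w, feat in zip(weights, branches):
        for i, value in enumerate(feat):
            fused[i] += w * value
    return weights, fused


# Toy example: three branches with equal raw scores, so each branch
# contributes equally to the fused vector.
branches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, fused = weighted_fusion(branches, [0.0, 0.0, 0.0])
```

Raising one branch's score shifts the fused vector toward that branch while the weights still sum to one, which is the sense in which such a fusion is "dynamic" in this sketch.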