A Lightweight Multi-Modal Attention Fusion Network for Guided Depth Map Super-Resolution
Zehui Xiao, Xuyang Tan, Xianhong Wen, Xiangyuan Zhu, Kehua Guo

Existing guided depth map super-resolution methods have achieved promising results, but most of them have large parameter counts and high computational complexity, which limits their practical applicability. To address these challenges, this paper proposes a lightweight guided depth map super-resolution framework based on the Multi-Modal Attention Fusion Network (MAFNet). Specifically, we design a two-branch architecture for MAFNet in which the Swin branch captures global dependencies and the CNN branch extracts detailed local features, thus ensuring efficient processing. Furthermore, we propose a triple cross-modal attention module for effective fusion of color and depth information, and a compression selectivity bottleneck module to select essential features. Experimental evaluation on multiple public datasets demonstrates that the proposed MAFNet achieves a compelling balance between model performance and computational efficiency, offering a robust solution for guided depth map super-resolution.
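
To make the idea of cross-modal fusion concrete, the following is a minimal sketch of a single generic cross-attention step in which depth features act as queries over color features. It is an illustrative assumption and not the triple cross-modal attention module of MAFNet, whose internals the abstract does not specify. The function names `softmax` and `cross_attention`, the absence of learned projections, and the residual addition are all hypothetical choices made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(depth_tokens, color_tokens):
    """Fuse color guidance into depth features with scaled dot-product attention.

    Each depth token is a query; color tokens serve as both keys and values.
    Assumption for illustration only: raw feature vectors are used directly,
    with no learned query/key/value projections.
    """
    dim = len(depth_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in depth_tokens:
        # Similarity between this depth token and every color token.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in color_tokens
        ]
        weights = softmax(scores)
        # Weighted average of the color tokens.
        attended = [
            sum(w * value[j] for w, value in zip(weights, color_tokens))
            for j in range(dim)
        ]
        # Residual connection (assumed): keep the depth feature, add guidance.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


# Toy example: two depth tokens, three color tokens, feature dimension 2.
depth = [[1.0, 0.0], [0.0, 1.0]]
color = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(depth, color)
```

The output has one fused vector per depth token, so the spatial layout of the depth features is preserved while each position receives a weighted mix of color features. A practical implementation would typically add learned projections and multiple heads, and would operate on tensors rather than Python lists.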