CIDR-MobileNet: A Monocular Pseudo-Depth and Cross-Modal Feature Fusion Approach for Chili Pepper Above-Ground Biomass Estimation
Yi Wang, Jingtao Deng, Lin Yang, Shangjing Ruan, Weijie Wang, Wenwu Hu, Ping Jiang

Accurate real-time estimation of above-ground biomass (AGB) is critical for intelligent chilli pepper harvesting. This study proposes CIDR-MobileNet, a lightweight end-to-end framework that addresses three limitations of existing methods: destructive sampling, reliance on additional depth sensors, and weak regression robustness. Pseudo-depth maps are generated from single-view RGB images using Depth Anything V2, providing low-cost structural information without extra hardware. A cross-modal feature interaction module adaptively fuses RGB texture with pseudo-depth geometry, while a multi-branch distribution regression head models AGB prediction as a probabilistic task to improve robustness against occlusion and noise. A ranking loss is also introduced to preserve the relative order of predictions. Validated on 275 in-field chilli pepper samples via ten-fold cross-validation, the model achieves an R² of 0.972, MAE of 174.56 g, RMSE of 230.74 g, and MAPE of 9.56%, with only 3.28 M parameters. Comparative experiments demonstrate that CIDR-MobileNet outperforms mainstream lightweight networks while maintaining high inference efficiency (10.56 ms CPU latency). The method strikes a favourable balance between prediction accuracy, hardware cost, and real-time performance, offering a practical solution for non-destructive biomass monitoring in precision agriculture.
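The four reported metrics can be illustrated with a minimal sketch, assuming their standard textbook definitions; the abstract does not state the exact formulas or the fold-averaging procedure used. The function name `regression_metrics` and the sample values below are hypothetical and are not taken from the study's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, RMSE and MAPE (%) for paired measured/predicted values.

    Assumes standard definitions; the true values must be non-zero (for MAPE)
    and must not all be identical (for R^2).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Mean absolute error, in the same unit as the target (grams here).
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error penalises large deviations more strongly.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute percentage error, relative to each measured value.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - residual variation / total variation.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "rmse": rmse, "mape": mape}


# Hypothetical AGB values in grams, for illustration only.
measured = [1000.0, 2000.0, 3000.0]
predicted = [1100.0, 1900.0, 3300.0]
metrics = regression_metrics(measured, predicted)
```

Under ten-fold cross-validation, such a function would typically be applied to the held-out predictions of each fold, with the results either averaged across folds or computed once over the pooled predictions; which of the two the study uses is not specified here.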