Identification of the Picking Stage for Volvariella Volvacea Fruiting Bodies Using an Improved YOLO11n Model
Haitao Yin, Jinpeng Wang, Bin Zhou, Yongqi Chao, Hongping Zhou
Accurate and rapid detection of Volvariella volvacea (straw mushroom) fruiting bodies at harvestable maturity is a critical prerequisite for automated industrial cultivation. However, existing detection methods often yield high false-negative and false-positive rates when processing small-scale, densely distributed, and heavily occluded targets against complex straw substrate backgrounds. Furthermore, these methods frequently struggle to balance model compactness (parameter count and computational complexity) against real-time performance on edge devices. To address these challenges, this study proposes YOLO11n-CPDM, a lightweight detection model based on an improved YOLO11n architecture. The model incorporates coordinated optimizations across feature extraction, fusion, and reconstruction. First, a Dual Coordinate Attention Feature Extraction mechanism is integrated into the C3k2 bottleneck blocks of the backbone network. This enhances target perception in complex, occluded environments by concurrently modeling global context and local salient features. Second, within the neck network, the standard attention module is replaced with the PnPNystraAttention module, coupled with the DySample dynamic upsampling operator. This modification strengthens contextual relationships among multi-scale features and improves spatial consistency during reconstruction while preserving linear computational complexity. Finally, the detection head is rebuilt from MBConv blocks, which use an inverted residual structure, to reduce the parameter count. Experimental results on a custom V. volvacea dataset show that the proposed YOLO11n-CPDM model achieves Precision (P), Recall (R), and mean Average Precision at an IoU threshold of 0.5 (mAP50) of 86.8%, 87.5%, and 88.4%, respectively.
These figures represent improvements of 2.7, 3.0, and 3.2 percentage points, respectively, over the baseline YOLO11n model. Additionally, the model size is reduced to 4.8 MB (a 12.7% decrease), and the model reaches inference speeds of 42.7 FPS on Jetson AGX Orin and 21.2 FPS on Jetson Nano, outperforming the baseline model on both embedded platforms. The proposed model therefore improves detection performance in complex environments while remaining lightweight and deployable on embedded hardware, providing a technical foundation for intelligent perception and automated harvesting of V. volvacea.
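The reported gains mix two kinds of comparison: the accuracy improvements are absolute (percentage points), while the size reduction is relative (a percentage of the baseline size). As a minimal sketch of that arithmetic, the Python snippet below back-calculates the baseline YOLO11n values implied by the figures above. The function and variable names are hypothetical, and the derived baseline numbers are inferences from rounded figures, not results stated in the abstract.

```python
# Values reported in the abstract for YOLO11n-CPDM (percent).
reported = {"P": 86.8, "R": 87.5, "mAP50": 88.4}
# Reported gains over the baseline YOLO11n (percentage points, i.e. absolute).
gains = {"P": 2.7, "R": 3.0, "mAP50": 3.2}


def implied_baseline(reported_pct, gain_points):
    """Subtract absolute percentage-point gains to recover baseline metrics."""
    return {k: round(reported_pct[k] - gain_points[k], 1) for k in reported_pct}


def implied_baseline_size(new_size_mb, relative_decrease):
    """Invert a relative size reduction: new = old * (1 - decrease)."""
    return new_size_mb / (1.0 - relative_decrease)


baseline_metrics = implied_baseline(reported, gains)
baseline_size_mb = implied_baseline_size(4.8, 0.127)

print(baseline_metrics)             # implied baseline P, R, mAP50 in percent
print(f"{baseline_size_mb:.2f} MB")  # implied baseline model size (approximate)
```

Because 4.8 MB and 12.7% are both rounded, the implied baseline size of roughly 5.5 MB is only approximate.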