DOI: 10.3390/horticulturae12070805 ISSN: 2311-7524

A Lightweight Task-Adaptive YOLO for Tomato Ripeness Detection in Complex Orchard Environments

Jieyuan Ding, Yunhan Zou, Yu Wang, Lianjie Han, Yawen Xiao, Ruihong Zhang, Xiaobo Xi

Accurate ripeness assessment of tomatoes in natural orchard settings is challenged by severe occlusion, dense clustering, and scale variation among fruits. This paper introduces YOLOv8n-TiD, a lightweight detection framework designed to overcome these obstacles. The architecture enhances the YOLOv8n backbone with a novel C2f-iRD module, which integrates re-parameterized dilated convolutions and inverted residual blocks to enlarge the receptive field while retaining fine-grained texture details. For feature fusion, we propose C2f-iRMB in the neck to ensure cross-scale consistency. To address semantic drift during upsampling, we replace standard interpolation with the DySample operator. The detection head is re-engineered as TADAH (Task-Adaptive Dynamic Alignment Head), which decouples the detection tasks via deformable convolutions and conditionally shared features. Additionally, we introduce F-PIoUv2, a regression loss that emphasizes medium-quality predictions and curbs excessive bounding box expansion. Evaluations on a custom dataset show that YOLOv8n-TiD cuts parameters by 45%, FLOPs by 19%, and model size by 41% while raising mAP@0.5 by 1.9 points and still running in real time. On Android devices, it sustains 30 FPS inference, and it generalizes effectively to a distinct cherry tomato cultivar. These findings confirm the method’s robustness in discriminating occluded, small, and visually similar maturity stages, providing a practical vision system for robotic harvesting and field-based grading.
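Both the mAP@0.5 metric and IoU-family regression losses such as F-PIoUv2 build on the intersection over union (IoU) of a predicted box and a ground-truth box. The sketch below is a minimal illustration of that underlying quantity, not the authors' loss, whose formulation is not given in this abstract. The names `iou` and `is_match` and the corner-format box layout are assumptions made for the example.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """True if the prediction overlaps the ground truth at the given IoU threshold."""
    return iou(pred, gt) >= threshold


if __name__ == "__main__":
    predicted = (0.0, 0.0, 10.0, 10.0)
    ground_truth = (0.0, 0.0, 10.0, 8.0)
    print(iou(predicted, ground_truth), is_match(predicted, ground_truth))
```

In mAP@0.5, a detection is typically counted as a true positive when its class is correct and its IoU with a ground-truth box reaches 0.5. A loss that only rewards IoU can favor oversized boxes, which is the kind of bounding box expansion the abstract says F-PIoUv2 is designed to curb.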
