Scale-Adaptive Infrared UAV Detection Under Fast Motion and Zooming
Xingwei Yan, Yan Zhang, Haiyong Chen, Yaxiu Zhang, Kunlin Zou

Infrared UAV detection plays a crucial role in both security surveillance and military applications. However, when a UAV moves fast or the camera zooms dynamically, the target's scale changes rapidly, which poses severe challenges to existing detection models, especially on resource-constrained edge devices. To address this, a lightweight scale-adaptive multi-scale feature fusion model, termed LMF-IR, is proposed for efficient and accurate detection under sudden changes in target size. The model integrates three key components: a Multi-Dilation Residual Block (MDRB) for enhanced multi-scale feature representation, an improved Channel Attention Model–Feature Fusion Pyramid Network (CAM-FPN) for adaptive feature fusion, and a modified P-WIoU loss function for precise bounding box regression under varying target sizes. The MDRB module captures fine-grained features across multiple scales, which helps the model identify targets of varying sizes reliably. The CAM-FPN incorporates a channel attention mechanism that dynamically adjusts feature weights, enabling the model to focus on informative feature channels. The P-WIoU loss function accounts for the shape characteristics of UAV target bounding boxes by combining centroid distance, overlap ratio, and aspect ratio, thereby improving localization accuracy under rapid scale changes. Experimental results on our self-built UAV–infrared dataset show that, compared with the baseline model, LMF-IR reduces floating-point operations by 1.4 G and cuts the parameter count to 62% of the baseline, while mAP@0.5:0.95 increases by 2.4%. On the public ANTI-UAV dataset, our method increases mAP@0.5:0.95 by 4.8%, indicating strong performance in real-time infrared UAV detection under rapid target scale changes.
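
To make the three ingredients of the bounding box loss concrete, the following is a minimal sketch of how overlap ratio, centroid distance, and aspect ratio can be combined into a single regression loss. It is not the P-WIoU loss itself, whose exact form is not given here; it uses the well-known CIoU formulation as a stand-in. The function name `ciou_style_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_style_loss(box_p, box_g, eps=1e-9):
    """CIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Illustrative stand-in only: combines overlap ratio, centroid
    distance, and aspect ratio, but is NOT the paper's P-WIoU loss.
    """
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap ratio (IoU).
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared centroid distance, normalized by the squared diagonal
    # of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((px1 + px2) - (gx1 + gx2)) ** 2 / 4 \
         + ((py1 + py2) - (gy1 + gy2)) ** 2 / 4

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    target = (10.0, 10.0, 30.0, 20.0)
    print(ciou_style_loss(target, target))                    # perfect match
    print(ciou_style_loss((12.0, 11.0, 32.0, 21.0), target))  # small shift
    print(ciou_style_loss((50.0, 50.0, 55.0, 70.0), target))  # disjoint, different shape
```

The loss is close to zero for a perfect match and grows as the predicted box drifts away or changes shape. Because the centroid term is normalized by the enclosing box, it still provides a useful signal when the boxes do not overlap at all, which is the regime that matters when a target's apparent size jumps between frames.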