DOI: 10.3390/rs18122036 ISSN: 2072-4292

A Target Tracking Method Based on Frequency and Spatial Information Perception in UAV Vision

Chenyang Li, Zhiheng Liu, Suiping Zhou

Target tracking for Unmanned Aerial Vehicles (UAVs) can be significantly degraded by environmental factors such as lighting variations, background clutter, and target occlusion. To address these challenges, we developed FSTrack, a target tracking method for UAV vision that integrates both frequency-domain and spatial perception. Specifically: (1) we used the Swin Transformer as the backbone network to extract features from both the template and search images; (2) we introduced a Transformer-based module that enhances both frequency and spatial information, improving tracking accuracy under varying illumination conditions; (3) we designed a spatio-temporal feature fusion module with multiple multi-head self-attention mechanisms to model the tracking state precisely, increasing reliability in cluttered and occluded environments; and (4) we created a hybrid loss function to improve accuracy in both the classification and regression tasks. Experimental results on the UAV123, DTB70, and UAVDT datasets show that FSTrack surpasses current state-of-the-art methods in both success rate and precision while also running faster.
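The two reported metrics, success rate and precision, can be illustrated with a minimal sketch. It assumes the conventions commonly used on single-object tracking benchmarks: boxes given as (x, y, width, height), success measured by intersection-over-union (IoU) overlap, and precision measured by center location error against a 20-pixel threshold. The abstract does not state which thresholds were used, so the default values, the function names, and the sample boxes below are illustrative assumptions rather than the authors' evaluation code.

```python
import math


def iou(box_a, box_b):
    """Intersection-over-union of two boxes in (x, y, w, h) format."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(box_a, box_b):
    """Euclidean distance in pixels between the two box centers."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    return math.hypot((ax + aw / 2) - (bx + bw / 2),
                      (ay + ah / 2) - (by + bh / 2))


def success_rate(preds, gts, iou_threshold=0.5):
    """Fraction of frames whose IoU exceeds the threshold (assumed 0.5)."""
    hits = sum(iou(p, g) > iou_threshold for p, g in zip(preds, gts))
    return hits / len(preds)


def precision(preds, gts, pixel_threshold=20.0):
    """Fraction of frames whose center error is within the threshold (assumed 20 px)."""
    hits = sum(center_error(p, g) <= pixel_threshold for p, g in zip(preds, gts))
    return hits / len(preds)


def success_auc(preds, gts, steps=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1]."""
    thresholds = [i / (steps - 1) for i in range(steps)]
    return sum(success_rate(preds, gts, t) for t in thresholds) / steps


# Made-up three-frame sequence: exact hit, slight drift, complete loss.
gts = [(10, 10, 20, 20), (10, 10, 20, 20), (10, 10, 20, 20)]
preds = [(10, 10, 20, 20), (12, 11, 20, 20), (100, 100, 20, 20)]

print("success rate:", success_rate(preds, gts))
print("precision:", precision(preds, gts))
print("success AUC:", success_auc(preds, gts))
```

Benchmark results are often summarized by the area under the success curve rather than by a single IoU threshold, which is why `success_auc` averages over a range of thresholds; whether the paper ranks trackers this way is not stated in the abstract.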
