Context-Aware Identity Prediction for Anti-UAV Multi-Object Tracking in Remote Sensing Videos
Bin Li, Tianyi Hu, Wenbo Wu, Jianming Hu

Anti-UAV multi-object tracking in remote sensing videos is challenging because UAV targets are small, weakly textured, and often affected by cluttered backgrounds, abrupt motion, occlusion, and intermittent visibility. To address these challenges, we formulate anti-UAV multi-object tracking as a context-aware identity prediction task, in which target identities and locations are inferred from historical trajectory priors instead of current-frame observations alone. Under this formulation, we propose a dual-track parallel tracking framework. The adaptive identity disambiguation (AID) module combines motion cues with appearance features according to their estimated reliability, improving short-term association when visual evidence is weak. In parallel, the motion-evolution temporal memory (METM) module models trajectory dynamics using motion anomaly detection and time-decayed memory, enabling spatiotemporal recovery after occlusion, temporary disappearance, or abrupt motion. The outputs of the two branches are integrated by a unified identity decision layer to produce stable tracking results. Experiments are conducted on the public 4th Anti-UAV Benchmark Track-3 and our newly constructed Anti-UAV Multi-Object Tracking dataset, AU-MOT. On the 4th Anti-UAV Benchmark Track-3, our method achieves 63.6% HOTA and 64.1% IDF1, outperforming the strongest competing method by 3.5% and 3.9%, respectively, while reducing identity switches and track fragments by 20.8% and 23.8%. On AU-MOT, it achieves 67.2% HOTA and 67.8% IDF1, with 20.2% fewer identity switches and 22.3% fewer track fragments. These results demonstrate its effectiveness under long-range observation, weak target appearance, cluttered backgrounds, abrupt motion, and intermittent target visibility.
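
The two ideas named above, reliability-weighted fusion of motion and appearance cues and time-decayed memory, can be illustrated with a minimal sketch. The abstract does not give the fusion rule or the decay form, so this sketch assumes a convex combination and an exponential decay with a half-life. The function names, the `half_life` parameter, and the way the two terms are multiplied are hypothetical and are not taken from the AID or METM modules.

```python
def fuse_affinity(motion_sim, appearance_sim, appearance_reliability):
    """Blend two similarity scores in [0, 1].

    Assumption: a convex combination in which the appearance weight equals
    its estimated reliability, and motion receives the remaining weight.
    """
    w = min(max(appearance_reliability, 0.0), 1.0)  # clamp to [0, 1]
    return w * appearance_sim + (1.0 - w) * motion_sim


def decayed_weight(frames_since_seen, half_life=10.0):
    """Assumed exponential decay: the weight halves every `half_life` frames."""
    return 0.5 ** (frames_since_seen / half_life)


def recovery_score(motion_sim, appearance_sim, appearance_reliability,
                   frames_since_seen, half_life=10.0):
    """Hypothetical score for re-attaching a detection to a lost track.

    The fused affinity is discounted by how long the track has been unseen.
    """
    fused = fuse_affinity(motion_sim, appearance_sim, appearance_reliability)
    return fused * decayed_weight(frames_since_seen, half_life)


if __name__ == "__main__":
    # A weakly textured target: appearance is unreliable, so motion dominates.
    print(fuse_affinity(motion_sim=0.8, appearance_sim=0.2,
                        appearance_reliability=0.25))
    # The same candidate matched against a track unseen for 10 frames.
    print(recovery_score(0.8, 0.2, 0.25, frames_since_seen=10))
```

With low appearance reliability the fused score stays close to the motion similarity, which matches the stated goal of keeping short-term association stable when visual evidence is weak. The decay term expresses the assumption that older trajectory memory should count for less during recovery.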