SkyPin: Benchmarking Target Geo-Localization from UAV Imagery on 2.5D Maps
Zhaochen Wang, Rouwan Wu, Yuxiang Liu, Yudong Huang, Shen Yan, Maojun Zhang

Accurate geo-localization of ground targets from unmanned aerial vehicles (UAVs) is critically limited by pose estimation errors and the scarcity of active ranging sensors. To address these challenges, we propose a pipeline that integrates reference image cropping, robust cross-view matching, and geographic projection to estimate real-world coordinates using 2.5D reference maps. For evaluation, we introduce SkyPin, the first large-scale benchmark of its kind, designed to comprehensively test UAV-based localization methods. It comprises UAV imagery from eight diverse environments, featuring both visible and thermal infrared modalities under a wide range of conditions, including variations in weather, time of day, flight altitude, and camera perspective. All ground targets are annotated with centimeter-accuracy Real-Time Kinematic (RTK) coordinates. We establish a comprehensive benchmark by evaluating a series of feature matching methods combined with different projection strategies, allowing systematic comparison of algorithm performance. Representative results show that RoMa combined with PnP-based raytracing achieves the best overall performance, reaching a median 2D error of 0.87 m and Recall@5m values of 0.94 and 0.98 on RGB and thermal infrared UAV-map settings, respectively. Further analysis reveals that performance degrades in challenging mountainous scenes and under large viewing-angle variations, highlighting terrain relief and UAV perspective changes as remaining critical challenges for robust target geo-localization. The full dataset and implementation code will be made publicly available to facilitate future research in UAV-based geo-localization.
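
The two headline metrics, median 2D error and Recall@5m, can be illustrated with a minimal sketch. It assumes that predicted and RTK ground-truth positions are already expressed in a common metric planar frame (for example, easting and northing in meters) and that Recall@5m denotes the fraction of targets whose horizontal error is at most 5 m; the abstract does not state these definitions, and the function names `horizontal_errors`, `median_error`, and `recall_at` are hypothetical.

```python
import math
from statistics import median


def horizontal_errors(pred_xy, gt_xy):
    """Per-target 2D Euclidean error in meters.

    Assumes both inputs are equal-length sequences of (x, y) pairs
    in the same metric planar frame (an assumption of this sketch).
    """
    if len(pred_xy) != len(gt_xy):
        raise ValueError("pred_xy and gt_xy must have the same length")
    return [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(pred_xy, gt_xy)
    ]


def median_error(errors):
    """Median of the per-target errors."""
    if not errors:
        raise ValueError("errors must not be empty")
    return median(errors)


def recall_at(errors, threshold_m=5.0):
    """Fraction of targets with error <= threshold_m (assumed definition)."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(e <= threshold_m for e in errors) / len(errors)


if __name__ == "__main__":
    # Toy coordinates for illustration only, not data from the benchmark.
    pred = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
    gt = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    errs = horizontal_errors(pred, gt)
    print(median_error(errs), recall_at(errs, 5.0))
```

Reporting the median alongside a thresholded recall is a common pairing: the median is robust to a few gross matching failures, while the recall indicates how often the estimate is usable at a fixed tolerance.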