DOI: 10.3390/drones10070483 ISSN: 2504-446X

A High- and Low-Level Decoupled Reinforcement Learning Method for Multi-UAV Cooperative Search

Jianjie Qiu, Yichao Cai, Hao Li, Lei Ni, Kai Yuan, Siyuan Cui

Multi-UAV cooperative search for static targets at unknown locations requires both efficient regional allocation and responsive local maneuvering. However, single-level learning methods often suffer from redundant coverage, unclear division of labor, and unstable training. This paper proposes a high- and low-level decoupled reinforcement learning method for multi-UAV cooperative search. The high level periodically generates UAV-specific regional goals from visitation maps, target-existence belief maps, and UAV positions, while a spatial self-attention module enhances the representation of unvisited regions, high-belief target areas, and UAV distributions. The low level selects discrete steering actions based on local observations and the high-level context, supported by a structured reward that encourages coverage, target discovery, goal-oriented progress, repeated-visit suppression, and boundary-safe motion. Simulation experiments are conducted in a two-dimensional grid environment with static targets and ideal sensing. Under this simplified simulation setting, the proposed method achieves higher training return and coverage rate than representative baseline algorithms while maintaining a high final target discovery rate and reaching the discovery threshold earlier. Ablation and visualization results further demonstrate the effectiveness and interpretability of the proposed hierarchical guidance mechanism within the considered simulation scenario.
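
The structured low-level reward named in the abstract can be illustrated with a minimal Python sketch. The abstract lists the five reward terms but not their form or weights, so everything concrete here is an assumption made for illustration: the names `RewardWeights` and `low_level_reward`, the weight values, the Manhattan-distance progress term, and the per-step boundary penalty are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    """Hypothetical weights; the paper's actual values are not given in the abstract."""
    coverage: float = 1.0   # bonus for entering a cell not visited before
    discovery: float = 5.0  # bonus per target found on this step
    progress: float = 0.5   # bonus per cell of progress toward the regional goal
    revisit: float = 0.2    # penalty for re-entering a visited cell
    boundary: float = 1.0   # penalty when the chosen action would leave the grid


def manhattan(a, b):
    """Grid distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def low_level_reward(prev_pos, new_pos, goal, visited, targets_found,
                     hit_boundary, w=None):
    """Combine the five terms listed in the abstract into one scalar reward.

    prev_pos, new_pos, goal: (row, col) cells; goal is the high-level regional goal.
    visited: set of cells visited before this step (by any UAV).
    targets_found: number of targets newly discovered on this step.
    hit_boundary: True if the steering action was clipped at the grid edge.
    """
    if w is None:
        w = RewardWeights()
    reward = 0.0
    # Coverage versus repeated-visit suppression.
    if new_pos in visited:
        reward -= w.revisit
    else:
        reward += w.coverage
    # Target discovery.
    reward += w.discovery * targets_found
    # Goal-oriented progress: positive when the UAV moves closer to its goal.
    reward += w.progress * (manhattan(prev_pos, goal) - manhattan(new_pos, goal))
    # Boundary-safe motion.
    if hit_boundary:
        reward -= w.boundary
    return reward


# Example step: a new cell, one target found, one cell closer to the goal.
example = low_level_reward((0, 0), (0, 1), (0, 3), {(0, 0)}, 1, False)
```

With the assumed default weights, the example step evaluates to 1.0 + 5.0 + 0.5 = 6.5. Keeping the terms additive and separately weighted makes it straightforward to switch one off, which is one plausible way an ablation of the kind the abstract mentions could be set up.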
