DOI: 10.3390/app16136570 ISSN: 2076-3417

An Adaptive Scheduling Algorithm Integrating Hierarchical Reinforcement Learning and Semi-Markov Decision Processes

Feng Wang, Bingwei Ding, Fangchao Tian, Zhaohua Guo, Wenshuo Ma

Coordinating multiple unmanned aerial vehicle (UAV) systems under strict energy and temporal constraints remains a complex scheduling problem. Existing reinforcement learning methods typically rely on fixed-time-step modeling, which struggles to accommodate flight actions of varying durations and often leads to temporal mismatches between task planning and physical execution. To address this limitation, we propose an Adaptive Hierarchical Semi-Markov Decision Process (AH-SMDP) framework. This architecture decouples task allocation from execution by modeling variable-length actions via an SMDP. An event-driven synchronization mechanism is introduced to align the swarm’s decision-making rhythm with actual task completion times. Additionally, a state-aware reward formulation and a dynamic action space pruning strategy are designed to help UAVs balance energy efficiency with deadline compliance. Simulation results in multi-constraint environments demonstrate that the AH-SMDP framework effectively improves scheduling performance compared to standard MAPPO and PPO algorithms. Under the evaluated experimental settings, the proposed method yields improvements of approximately 30% in average task completion rate, 40% in energy reduction, and 60% in convergence stability. Ablation studies further suggest that this integrated framework offers a viable and effective approach for multi-UAV scheduling.
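The two timing ideas in the abstract, variable-length actions and event-driven synchronization, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the agent identifiers, and the simplification of crediting each reward at the start of its action are assumptions made here for illustration. The first function discounts by elapsed time rather than by step count, which is what distinguishes an SMDP return from a fixed-time-step one. The second uses a priority queue so that each UAV makes its next decision only when its current action completes.

```python
import heapq


def smdp_return(rewards, durations, gamma=0.99):
    """Discounted return for variable-length actions (illustrative).

    Action k yields rewards[k] and lasts durations[k] time units.
    Assumption: the reward is credited when the action starts, so it is
    discounted by gamma raised to the time elapsed before that action.
    """
    total, elapsed = 0.0, 0.0
    for reward, tau in zip(rewards, durations):
        total += (gamma ** elapsed) * reward
        elapsed += tau
    return total


def event_driven_decisions(action_durations):
    """Order decision points by actual completion time (illustrative).

    action_durations maps a hypothetical agent id to the durations of its
    successive actions. Returns (time, agent, action_index) tuples in the
    order in which the agents start each action.
    """
    # Every agent makes its first decision at time 0.
    queue = [(0.0, agent) for agent in sorted(action_durations)]
    heapq.heapify(queue)
    next_index = {agent: 0 for agent in action_durations}
    decisions = []
    while queue:
        now, agent = heapq.heappop(queue)
        i = next_index[agent]
        if i == len(action_durations[agent]):
            continue  # this agent has no further actions to start
        decisions.append((now, agent, i))
        # The agent becomes free again when the chosen action finishes.
        heapq.heappush(queue, (now + action_durations[agent][i], agent))
        next_index[agent] = i + 1
    return decisions


if __name__ == "__main__":
    print(smdp_return([1.0, 1.0], [2.0, 1.0], gamma=0.5))
    print(event_driven_decisions({"uav0": [3.0, 1.0], "uav1": [1.0, 1.0]}))
```

In the example call, uav1 makes its second decision at time 1.0 while uav0 is still executing its first, three-unit action. A fixed-time-step model would instead force both agents to decide on the same clock tick.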
