DOI: 10.3390/s26134057 ISSN: 1424-8220

An Intelligent Energy-Aware Framework for 6G-Enabled Non-Terrestrial IoT via Reinforcement Learning

Ali Nauman, Sung Won Kim

6G promises ultra-low latency, high data throughput, and seamless global connectivity. However, providing uninterrupted connectivity in remote and underserved regions remains a critical challenge for Terrestrial Networks (TNs), where the cost of deploying infrastructure is difficult to justify against sparse user density. Standardized under 3GPP Release 17, Non-Terrestrial Networks (NTNs) have emerged as a viable solution to close this digital divide. Among NTN platforms, High-Altitude Platform Stations (HAPS) occupy a strategic middle ground, as they deliver lower propagation delays than Low-Earth Orbit (LEO) satellites while achieving far broader coverage than TN-based Base Stations (BSs). Despite these advantages, battery-powered Internet of Things (IoT) devices communicating via HAPS face a fundamental energy efficiency (EE) challenge: transmit power must be carefully managed to maximize data throughput while preserving battery life and minimizing packet queuing delays. To address this, we propose a Q-learning-based Reinforcement Learning (RL) framework. The RL agent observes the instantaneous battery level and queue state of the IoT device and, in each time slot, dynamically selects a transmit power level from a discrete action space. Unlike traditional heuristic algorithms, such as Round Robin (RR), Max Signal-to-Noise Ratio (Max-SNR), and fixed-power allocation, which rely on static rules or greedy channel-based decisions, the proposed Q-learning agent learns adaptive, long-term optimal policies through direct interaction with the environment, without requiring explicit mathematical modeling of the channel or traffic dynamics. Extensive simulations demonstrate that the proposed framework achieves up to 40% higher average EE than all benchmark schemes, maintains consistently lower power consumption, and exhibits superior statistical reliability, as evidenced by a right-shifted Cumulative Distribution Function (CDF) of EE. These results demonstrate Q-learning as a promising candidate for scalable, energy-aware power control of next-generation HAPS-assisted IoT deployments in 6G NTN ecosystems.
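
As a rough illustration of the kind of agent the abstract describes, the following minimal sketch implements tabular Q-learning over a state of (battery level, queue length) with a discrete set of transmit power levels. It is not the authors' implementation. The abstract does not give the reward function, channel model, or simulation parameters, so everything below is an assumption made for illustration: the power levels, the exponential fading gain, the Shannon-type rate, the circuit power of 0.05, the queue penalty weight, the learning constants, and the function names `step` and `train`.

```python
import math
import random

# All constants are hypothetical; none come from the paper.
POWER_LEVELS = [0.0, 0.1, 0.2, 0.4]   # discrete transmit powers (W)
MAX_BATTERY = 10                      # battery discretized into units
MAX_QUEUE = 5                         # maximum buffered packets
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1 # learning rate, discount, exploration


def step(battery, queue, action, rng):
    """Toy environment: returns (next_battery, next_queue, reward)."""
    power = POWER_LEVELS[action]
    cost = action  # assumed: battery drain grows with the power index
    if cost > battery:
        power, cost = 0.0, 0  # not enough energy, so stay silent
    gain = rng.expovariate(1.0)                  # assumed fading power gain
    rate = math.log2(1.0 + 10.0 * power * gain)  # assumed Shannon-type rate
    sent = min(queue, int(rate))                 # packets served this slot
    arrivals = 1 if rng.random() < 0.5 else 0    # assumed Bernoulli traffic
    next_queue = min(MAX_QUEUE, queue - sent + arrivals)
    next_battery = battery - cost
    # Assumed reward: packets per unit power (an EE proxy) minus a queue penalty.
    reward = sent / (power + 0.05) - 0.5 * next_queue
    return next_battery, next_queue, reward


def train(episodes=2000, slots=50, seed=0):
    """Learn a Q-table indexed as q[battery][queue][action]."""
    rng = random.Random(seed)
    n_actions = len(POWER_LEVELS)
    q = [[[0.0] * n_actions for _ in range(MAX_QUEUE + 1)]
         for _ in range(MAX_BATTERY + 1)]
    for _episode in range(episodes):
        battery, queue = MAX_BATTERY, 0
        for _slot in range(slots):
            # Epsilon-greedy action selection.
            if rng.random() < EPSILON:
                action = rng.randrange(n_actions)
            else:
                values = q[battery][queue]
                action = max(range(n_actions), key=lambda a: values[a])
            nb, nq, reward = step(battery, queue, action, rng)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + GAMMA * max(q[nb][nq])
            q[battery][queue][action] += ALPHA * (target - q[battery][queue][action])
            battery, queue = nb, nq
    return q
```

After training, a greedy policy can be read off by taking the highest-valued action in `q[battery][queue]` for each state. A tabular representation is workable here only because both state variables are coarsely discretized; a finer state space would need function approximation.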
