A Cooperative Multi-Agent QTRAN Framework for Artificial Intelligence-Driven Cognitive V2X in the Internet of Vehicles
Ramzi Bouzoubia, Sofiane Zaidi, Lazhar Khamer, Mostafa Ogab, Carlos T. Calafate
Resource allocation for cognitive Vehicle-to-Everything (V2X) networks is challenging due to dynamic spectrum sharing, strong interference coupling, and stringent latency constraints for safety-critical Vehicle-to-Vehicle (V2V) traffic. Although recent Multi-Agent Reinforcement Learning (MARL) approaches report promising gains, many evaluations are conducted at limited and fixed network scales, which restricts insight into scalability under dense spectrum reuse. This paper investigates cooperative multi-agent learning for interference-aware and deadline-constrained V2X resource management. We propose a Q-value Transformation (QTRAN)-based value decomposition framework under centralized training with decentralized execution (CTDE) for joint resource-block and power allocation among V2V agents. The proposed approach is implemented in a realistic V2V/Vehicle-to-Infrastructure (V2I) simulator incorporating Manhattan grid mobility, fast fading, explicit cross-tier and co-channel interference, and per-link payload/deadline dynamics. Beyond communication-level performance, improved timely delivery of V2V safety messages can support cooperative maneuvering, collision avoidance, platooning, and infrastructure-assisted traffic management. Extensive simulations across varying numbers of V2V agents benchmark QTRAN against independent learning baselines including MARL and centralized single-agent learning (SARL). Results show that QTRAN improves performance compared with the selected learning baselines and enhances the throughput–reliability trade-off under interference-coupled spectrum reuse. For instance, at N_V2V = 20, QTRAN achieves a V2V rate of 0.194±0.004 and a V2I rate of 9.117±0.213, while reaching a V2V success rate of 0.812±0.017 with a low Deadline Miss Ratio of 0.001±0.000.
At higher density (N_V2V = 50), QTRAN sustains strong reliability (V2V success rate of 0.719±0.006 and Completion Ratio of 0.716±0.006) while maintaining competitive infrastructure throughput (V2I rate of 9.251±0.114). These results indicate that QTRAN effectively captures non-linear interference interactions, enabling coordinated decentralized spectrum and power decisions under the adopted density-based evaluation setting, and thereby enhancing V2V reliability and throughput in the cognitive Internet of Vehicles.
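To make the value-decomposition idea concrete, the following minimal sketch checks the factorization condition from the original QTRAN formulation on a toy two-agent problem. It is an illustration under stated assumptions, not the implementation used in this paper: the function names (`qtran_gaps`, `satisfies_qtran`), the tabular representation, and the payoff values are all hypothetical, and each action index stands in for one (resource block, power level) choice. The condition requires that the sum of per-agent utilities minus the joint value, plus a state-value offset, is zero at the jointly greedy action and non-negative elsewhere, so that each V2V agent can act on its own utility at execution time.

```python
from itertools import product

def qtran_gaps(q_individual, q_joint):
    """Return the per-agent greedy joint action and the QTRAN gap for every joint action.

    q_individual: list of per-agent utility lists, q_individual[i][a].
    q_joint: dict mapping a joint action tuple to its joint action-value.
    """
    # Decentralized execution: each agent maximizes only its own utility.
    greedy = tuple(max(range(len(q)), key=q.__getitem__) for q in q_individual)
    # Offset V_jt = max_u Q_jt(u) - sum_i Q_i(greedy_i).
    v_jt = max(q_joint.values()) - sum(q[a] for q, a in zip(q_individual, greedy))
    gaps = {}
    for u in product(*(range(len(q)) for q in q_individual)):
        q_sum = sum(q[a] for q, a in zip(q_individual, u))
        gaps[u] = q_sum - q_joint[u] + v_jt
    return greedy, gaps

def satisfies_qtran(q_individual, q_joint, tol=1e-9):
    """True if the gap is zero at the greedy action and non-negative everywhere."""
    greedy, gaps = qtran_gaps(q_individual, q_joint)
    return abs(gaps[greedy]) <= tol and all(g >= -tol for g in gaps.values())

# Hypothetical non-monotonic payoff: both agents must coordinate on action 0.
q_joint = {(0, 0): 8.0, (0, 1): -12.0, (1, 0): -12.0, (1, 1): 0.0}
good = [[4.0, 0.0], [4.0, 0.0]]  # greedy actions recover the joint optimum
bad = [[0.0, 1.0], [0.0, 1.0]]   # greedy actions pick the suboptimal (1, 1)

print(satisfies_qtran(good, q_joint))
print(satisfies_qtran(bad, q_joint))
```

In a learned setting the tables would be replaced by neural networks trained centrally, with the gap terms entering the loss; the tabular check above only shows which utility assignments are consistent with the joint value.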