DOI: 10.3390/en19133147 ISSN: 1996-1073

Multi-Agent Deep Reinforcement Learning for Dynamic Cost Overrun Mitigation in Smart Grid Construction Projects

Yongjie Li, Xin Niu, Peng Li, Hua Liu, Ruoxi Dong, Nan Li, Zhongfu Tan

This study develops a cooperative multi-agent deep reinforcement learning (MARL) framework for simulation-based cost-overrun mitigation in smart grid construction projects under dynamic engineering uncertainty. Modern smart grid construction involves digital substations, renewable-energy-connected facilities, flexible transmission assets, intelligent monitoring systems, and geographically distributed contractors; therefore, cost escalation is driven by sequential interactions among procurement, schedule execution, equipment deployment, supervision, weather, logistics, and price volatility. The proposed framework models procurement management, construction scheduling, equipment allocation, and supervision-control units as decentralized agents embedded in a calibrated construction simulation environment. The environment is parameterized from 42 smart grid construction projects in Henan Province, China, and generates disturbance scenarios involving weather efficiency loss, transportation delay, market-price volatility, labor shortage, and supply-chain interruption. A hybrid DQN–PPO mechanism represents mixed decision structures: value-based DQN modules handle discrete managerial choices such as task acceleration, supplier switching, and procurement timing, whereas PPO modules adjust continuous resource-allocation and recovery-intensity decisions. A hierarchical reward function combines local departmental objectives with project-level penalties for cost overrun, schedule delay, idle resources, recovery expenditure, safety risk, and environmental impact. The experimental protocol uses 30 paired random seeds, nonparametric bootstrap confidence intervals, Holm-adjusted Wilcoxon signed-rank tests, and comparison with deterministic optimization, rolling-horizon MPC, stochastic/robust optimization, single-agent DRL, MAPPO, MADDPG/MATD3, QMIX, and HAPPO baselines.
The proposed framework achieves a mean cost-overrun rate of 6.83% and a mean schedule deviation of 16.82 days, reducing cost overrun by 18.7% and schedule deviation by 21.4% relative to rule-based construction management under the reported disturbance settings. The calibrated simulation evidence establishes a statistically evaluated decision-support framework for coordinated construction cost control and provides an artifact-level reproducibility pathway through configuration files, random-seed lists, anonymized synthetic benchmarks, and aggregated logs.
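
The statistical protocol named in the abstract (paired seeds, nonparametric bootstrap confidence intervals, Holm adjustment of multiple comparisons) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `holm_adjust` and `paired_bootstrap_ci`, the percentile-interval choice, the resample count, and all numeric inputs are assumptions made for illustration. The Wilcoxon signed-rank test itself is not implemented here; the sketch assumes its raw p-values have already been computed elsewhere.

```python
import random
from statistics import mean


def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Visit hypotheses from the smallest raw p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def paired_bootstrap_ci(a, b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean paired difference a[i] - b[i].

    a and b hold one metric value per random seed for two methods,
    aligned so that index i refers to the same seed in both lists.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(diffs)
    # Resample the paired differences with replacement and record each mean.
    means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical raw p-values from three baseline comparisons.
    print(holm_adjust([0.01, 0.04, 0.03]))
    # Hypothetical per-seed cost-overrun rates (%) for two methods.
    baseline = [8.4, 8.1, 8.9, 8.6, 8.2]
    proposed = [6.9, 6.7, 7.1, 6.8, 6.6]
    print(paired_bootstrap_ci(baseline, proposed))
```

Resampling the paired differences, rather than each method's results separately, preserves the seed-level pairing that the protocol relies on; the Holm step then controls the family-wise error rate across the several baseline comparisons.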
