Decentralized Cooperative Power Dispatch Based on Multi-Agent Reinforcement Learning and Offline Digital Twin Technology for Building Integrated Photovoltaics and Energy Storage System Clusters
Qinwei Li, Haowei Xing, Han Zhu, Zhengrong Li
Under carbon peaking and carbon neutrality goals, building-integrated photovoltaics and energy storage system clusters (BIPECs) enable efficient on-site use of renewable energy and can act as dispatch units for the public grid. However, BIPECs face significant uncertainties and are still under development. This study proposes a decentralized cooperative power dispatch model that couples a multi-agent proximal policy optimization (MAPPO) algorithm with offline digital twin (ODT) technology to optimize the photovoltaic (PV) power consumption of clusters despite limited data availability. An integrated BIPEC energy system model is established, and, by modeling the BIPEC as a multi-agent system, the decentralized dispatch problem is converted into a fully cooperative multi-agent reinforcement learning (MARL) problem. A simulation-assisted ODT framework constructs a digital environment in which data are augmented, MAPPO is trained, and the reward function is optimized, yielding the power dispatch strategies. The results show that the proposed optimization model obtains dispatch strategies that reflect a high degree of collaboration, reducing the cumulative power supply from the public grid by 0.55–2.56% per month compared with the non-cooperative strategy in which each building generates and consumes its own power. This study demonstrates the application of MARL to BIPECs by introducing a decentralized collaborative power dispatch methodology for building clusters, enhancing building energy efficiency and facilitating flexible collaborative power dispatch.
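
To make the comparison in the abstract concrete, the following is a minimal sketch of how grid supply might be compared between a self-use baseline and an idealized cooperative cluster, and of what a shared team reward could look like in a fully cooperative setting. It is not the authors' model: the function names, the single-time-step view, the lossless pooling of surplus PV, the omission of energy storage, and the toy numbers are all assumptions made for illustration.

```python
from typing import List


def grid_import_self_use(pv: List[float], load: List[float]) -> float:
    """Baseline (assumed): each building uses only its own PV; any deficit comes from the grid."""
    return sum(max(l - p, 0.0) for p, l in zip(pv, load))


def grid_import_cooperative(pv: List[float], load: List[float]) -> float:
    """Idealized cooperation (assumed): surplus PV is pooled across the cluster without losses."""
    return max(sum(load) - sum(pv), 0.0)


def shared_reward(grid_import: float) -> float:
    """Fully cooperative setting: every agent receives the same team reward.
    Here the reward is assumed to be the negative cluster-level grid import."""
    return -grid_import


def percent_reduction(baseline: float, cooperative: float) -> float:
    """Relative reduction in grid supply, in percent, versus the baseline."""
    if baseline == 0.0:
        return 0.0
    return 100.0 * (baseline - cooperative) / baseline


# Toy single-step data in kWh for three buildings (illustrative, not from the study).
pv = [5.0, 1.0, 0.0]
load = [2.0, 3.0, 2.0]

baseline = grid_import_self_use(pv, load)        # 4.0
cooperative = grid_import_cooperative(pv, load)  # 1.0
print(percent_reduction(baseline, cooperative))  # 75.0
```

The toy reduction is far larger than the 0.55–2.56% reported above because it ignores storage, network constraints, and the already high self-consumption of real buildings; it only illustrates the quantity being compared.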