Short-Term Load Forecasting Based on Scene Clustering and Transformer–BiGRU–Attention
Qinglei Zhang, Yao Wang, Ying Zhou

Short-term load forecasting suffers from insufficient accuracy because of the strong randomness of distributed energy output, variable electricity consumption patterns, and complex meteorological factors. To address this, the study proposes a load forecasting method that integrates K-means scene clustering with a Transformer–BiGRU–Attention hybrid deep learning architecture (CTBA). Unlike conventional Transformer–BiGRU hybrid forecasters, which train a single global predictor across all operating conditions, the proposed CTBA framework first partitions daily load curves into representative scenes and then routes each sample to a scene-specific Transformer–BiGRU–Attention predictor, thereby reducing distributional heterogeneity before temporal modeling. First, the K-means algorithm performs scene clustering on historical daily load curves, and the number of clusters is selected according to the silhouette coefficient and downstream prediction performance. The model is then trained separately on each cluster subset. The Transformer encoder captures long-range global dependencies in the load sequence through self-attention, the BiGRU module extracts local bidirectional temporal fluctuation features, and the attention mechanism focuses on key time nodes such as the morning and evening peaks while fusing multi-source data, including historical load, day-ahead electricity price, and multi-dimensional meteorological factors. Experimental results on the German ENTSO-E power dataset show that the proposed model reaches a coefficient of determination R² of 0.9893, with MAE, RMSE, and MAPE of 0.0141, 0.0187, and 3.92%, respectively, outperforming benchmark models such as SVR, LSTM, CNN, and TCN-BiGRU.
Ablation experiments further show that removing the clustering step, the Transformer, the BiGRU, or the attention layer degrades performance, confirming the contribution of each component and the effectiveness of the method for short-term load forecasting in power systems.
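
The scene-clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the daily curves are synthetic 24-point profiles invented for illustration, the K-means routine uses a deterministic farthest-point initialization for reproducibility, the number of clusters is chosen by silhouette coefficient alone (the downstream prediction check is omitted), and routing a new day to a scene is assumed to be a nearest-centroid lookup. All function names are hypothetical.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid (Euclidean) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def farthest_point_init(X, k):
    """Deterministic init (an assumption): start at X[0], then greedily add
    the sample farthest from all centroids chosen so far."""
    centroids = [X[0]]
    for _ in range(k - 1):
        chosen = np.array(centroids)
        d = np.linalg.norm(X[:, None, :] - chosen[None, :, :], axis=2).min(axis=1)
        centroids.append(X[d.argmax()])
    return np.array(centroids)

def kmeans(X, k, n_iter=100):
    """Lloyd's K-means; returns (labels, centroids)."""
    centroids = farthest_point_init(X, k)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Recompute each centroid; keep the old one if its cluster is empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(X, centroids), centroids

def silhouette(X, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue
        a = D[i, own].sum() / (own.sum() - 1)  # mean intra-cluster distance, self excluded
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()

def select_k(X, k_range=range(2, 7)):
    """Pick the k with the highest mean silhouette coefficient."""
    scores = {k: silhouette(X, kmeans(X, k)[0]) for k in k_range}
    return max(scores, key=scores.get), scores

# Synthetic daily load curves (illustrative only): three scene shapes, 60 days each.
hours = np.arange(24)

def bump(mu, sigma):
    return np.exp(-0.5 * ((hours - mu) / sigma) ** 2)

shapes = [
    0.5 + 0.4 * bump(9, 2) + 0.5 * bump(19, 2),  # morning and evening peaks
    0.5 + 0.5 * bump(13, 3),                     # single midday peak
    np.full(24, 0.4),                            # flat low-load day
]
rng = np.random.default_rng(0)
curves = np.vstack([s + rng.normal(0, 0.02, size=(60, 24)) for s in shapes])

best_k, scores = select_k(curves)
labels, centroids = kmeans(curves, best_k)
# Route a new day to a scene-specific predictor via its nearest centroid.
scene_of_new_day = assign(curves[:1], centroids)[0]
print(best_k, scene_of_new_day)
```

In practice, a library routine such as scikit-learn's `KMeans` together with `silhouette_score` would typically replace the hand-written functions; the version here only makes the selection logic explicit. Each resulting label would then index the predictor trained on that cluster subset.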