DOI: 10.3390/en19122936 ISSN: 1996-1073

Multi-Source Meteorological–Topographic Modeling of Monthly Power Generation for Mountain Photovoltaic Stations Using Gradient-Boosted Trees

Pengjie Sun, Ming Wang, Dan Meng, Yang Xu, Chi Cheng, Wei Ju

Mountain photovoltaic (PV) stations are increasingly deployed in complex terrain, where generation is jointly controlled by solar-resource variability, near-surface meteorology, and local topography. However, the quantitative contribution of topographic factors to regional-scale PV generation remains insufficiently evaluated, and many prediction studies rely on single-station or short-term records. In this study, monthly measured generation from 118 standardized village-level mountain PV stations in Badong County, western Hubei Province, China (2019–2021), was integrated with Solargis Global Horizontal Irradiance (GHI)-related solar-resource data, high-resolution gridded meteorological data, a 25 m digital elevation model, seasonal-cycle variables, and historical-generation features. After seasonally grouped median-absolute-deviation (MAD) outlier screening, GIS-based spatial matching, terrain extraction, and viewshed-derived shading analysis, regression models and climatology baselines were compared under both chronological validation and station-exclusion spatial cross-validation. Under strict chronological validation, CatBoost achieved the best temporal performance among the tested models (R² = 0.3119, MAE = 2719.7 kWh, RMSE = 3245.6 kWh), slightly outperforming the monthly climatology baseline. In the station-exclusion spatial cross-validation, XGBoost achieved the highest mean R² (0.8659), indicating good spatial transferability to unseen stations. Correlation and partial-correlation analyses showed that the temperature-related variable group and monthly radiation were the dominant meteorological controls, whereas elevation, slope, and terrain shading showed weak direct correlations with monthly generation for already-sited stations. Annual 90% prediction intervals were further estimated using residual bootstrapping, with an empirical coverage of 94.9%.
The proposed framework provides a practical basis for monthly generation forecasting and operational assessment of already-built distributed PV stations in mountainous regions, while its application to greenfield site selection requires additional site engineering and near-field obstruction information.
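
The seasonally grouped MAD outlier screening mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the season definition, the cutoff, or the scaling constant, so everything specific here is an assumption: the month-to-season mapping `SEASON_OF_MONTH`, the function name `mad_outlier_flags`, the modified z-score form with the conventional 0.6745 factor, and the 3.5 threshold. The sample values are invented for illustration and are not data from the study.

```python
from collections import defaultdict
from statistics import median

# Assumed grouping: Northern Hemisphere meteorological seasons.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def mad_outlier_flags(records, threshold=3.5):
    """Flag outliers within each season using the median absolute deviation.

    records: list of (month, generation_kwh) tuples.
    Returns a list of booleans aligned with records (True = flagged).
    The threshold of 3.5 is an assumed, commonly used cutoff.
    """
    # Collect the values belonging to each season.
    groups = defaultdict(list)
    for month, value in records:
        groups[SEASON_OF_MONTH[month]].append(value)

    # Per-season median and MAD.
    stats = {}
    for season, values in groups.items():
        med = median(values)
        mad = median(abs(v - med) for v in values)
        stats[season] = (med, mad)

    flags = []
    for month, value in records:
        med, mad = stats[SEASON_OF_MONTH[month]]
        if mad == 0:
            # Degenerate group: no spread to scale by, so flag nothing.
            flags.append(False)
            continue
        # 0.6745 makes the MAD comparable to a standard deviation
        # for normally distributed data.
        modified_z = 0.6745 * (value - med) / mad
        flags.append(abs(modified_z) > threshold)
    return flags


# Invented monthly records (month, kWh) for illustration only.
sample = [
    (1, 3000), (1, 3100), (2, 2900), (12, 3050), (1, 9000),
    (7, 8000), (7, 8200), (8, 7900), (6, 8100),
]
flags = mad_outlier_flags(sample)
```

In this toy input the 9000 kWh January record is flagged because it is compared only against other winter months; pooled with the summer records it would look unremarkable, which is the motivation for grouping by season before screening.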
