DOI: 10.3390/geomatics6040073 ISSN: 2673-7418

Spatiotemporal Modeling and Uncertainty Quantification of Reference Evapotranspiration Using Machine Learning and Bayesian Model Averaging in Benin

Bienvenue Christela Finounou Mizele, Modeste Meliho, Vinasetan Ratheil Houndji, Semevo Arnaud R. M. Ahouandjinou, Collins A. Orlando

Reference evapotranspiration (ET0) represents the atmospheric demand for water from a well-watered vegetated surface and is a key component of the hydrological cycle and agricultural water management. This study evaluated seven machine learning (ML) models for predicting monthly FAO-56 Penman–Monteith ET0 in Benin: Linear Regression (LR), Random Forest (RF), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGBoost), Decision Trees (DT), and Cubist. The target variable was calculated from data collected at six synoptic stations over the 2017–2021 period. Ten remote-sensing and topographic predictors were used: MODIS Land Surface Temperature (LST), six Sentinel-2 optical vegetation indices (NDVI, EVI, NDMI, NDWI, MSI, NDRE), elevation, and cyclic month encoding. Models were trained on the 2017–2019 period and evaluated on an independent temporal test set (2020–2021). The seven models were also combined via Bayesian Model Averaging (BMA), with posterior weights estimated by the expectation-maximization (EM) algorithm. All models showed positive predictive performance, with the BMA ensemble achieving the highest accuracy (RMSE = 7.0% of mean ET0, R² = 0.802), followed by Cubist (RMSE = 7.3%, R² = 0.787) and DT (RMSE = 7.5%, R² = 0.776). The BMA ensemble was then used to produce 1 km monthly ET0 maps for Benin for 2025. The BMA-derived inter-model standard deviation provided spatially explicit uncertainty estimates, revealing that prediction uncertainty is greatest in the northern Sudanian zone during the dry season. The ET0 target variable was constructed as a hybrid product combining station temperature observations with solar radiation, wind speed, and vapor pressure deficit extracted from the TerraClimate gridded reanalysis dataset; this methodological choice is discussed as a study limitation.
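
The abstract does not spell out how the BMA weights are computed, so the following is only a minimal sketch of one common formulation: each model's prediction is treated as the center of a Gaussian with a shared variance, and EM alternates between assigning responsibilities to models and updating the weights. The function names `bma_em_weights` and `bma_combine`, the shared-variance assumption, and the weighted inter-model spread are illustrative choices, not the authors' implementation.

```python
import numpy as np


def bma_em_weights(preds, y, n_iter=500, tol=1e-10):
    """Estimate BMA weights by EM (hypothetical helper, shared Gaussian variance).

    preds: array (n_samples, n_models) of model predictions on a fitting set.
    y:     array (n_samples,) of observed values.
    Returns (weights, sigma2).
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = preds.shape
    sq_err = (y[:, None] - preds) ** 2      # squared residual of each model
    w = np.full(k, 1.0 / k)                 # start from equal weights
    sigma2 = sq_err.mean()                  # start from pooled error variance
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each model for each observation
        dens = np.exp(-0.5 * sq_err / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
        mix = dens * w
        total = np.maximum(mix.sum(axis=1, keepdims=True), 1e-300)
        z = mix / total
        # M-step: weights are mean responsibilities; variance is weighted MSE
        w = z.mean(axis=0)
        sigma2 = max((z * sq_err).sum() / n, 1e-12)
        # Stop when the mixture log-likelihood no longer improves
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return w, sigma2


def bma_combine(preds, w):
    """Weighted ensemble mean and weighted inter-model standard deviation."""
    preds = np.asarray(preds, dtype=float)
    mean = preds @ w
    spread = np.sqrt(((preds - mean[:, None]) ** 2) @ w)
    return mean, spread
```

In this sketch, `preds` would hold the seven models' predictions for each station-month in a fitting set, and `bma_combine` would then be applied to per-pixel predictions to obtain an ensemble ET0 value and an uncertainty layer. Which data the authors used to fit the weights is not stated in the abstract.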
