Machine Learning-Driven Prediction of Quality Water in the Ebro River from the Ascó Water Quality Monitoring Station Data, Northeast Spain
Paulino José García-Nieto, Esperanza García-Gonzalo, José Ramón Alonso Fernández, Cristina Díaz Muñiz, Luis Alfonso Menéndez-García

Continuous monitoring of river water quality is essential for understanding hydro-chemical dynamics and supporting adaptive water resource management. This study develops a Gaussian Process Regression (GPR) model with a Matérn 3/2 kernel, optimized via the Nelder–Mead (NM) algorithm, to predict turbidity, electrical conductivity, and UV254 absorbance in the Ebro River (Spain), without considering temporal dependencies. The Matérn 3/2 kernel was selected for its ability to model the non-smooth, heterogeneous variability of water quality parameters, and its hyperparameters were tuned separately for each indicator, revealing unique spatial scales of variability. Using 46,287 high-frequency observations, the model achieves Nash–Sutcliffe Efficiency (NSE) values of 0.8956, 0.9503, and 0.9180 for turbidity, conductivity, and absorbance, respectively. Feature importance analysis identifies flow rate and nitrates as the dominant drivers. The GPR framework outperforms the multilayer perceptron (MLP), multivariate linear regression (MLR), and M5 model tree (M5P) approaches, demonstrating its potential for real-time water quality monitoring.
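For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of the standard Matérn 3/2 covariance function and the Nash–Sutcliffe Efficiency. It is not the authors' implementation: the function names `matern32` and `nash_sutcliffe`, the parameter names `length_scale` and `variance`, and the toy values are assumptions made for illustration only.

```python
import math


def matern32(r, length_scale=1.0, variance=1.0):
    """Matérn 3/2 covariance between two points separated by distance r.

    k(r) = variance * (1 + sqrt(3) * r / length_scale) * exp(-sqrt(3) * r / length_scale)
    The parameter names are illustrative, not taken from the paper.
    """
    scaled = math.sqrt(3.0) * abs(r) / length_scale
    return variance * (1.0 + scaled) * math.exp(-scaled)


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squared deviations from the observed mean.

    A value of 1 means a perfect fit; 0 means the model is no better than
    always predicting the mean of the observations.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equally long")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0.0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - sse / sst


if __name__ == "__main__":
    # Toy values only; these are not data from the Ascó station.
    print(matern32(0.5, length_scale=2.0))
    print(nash_sutcliffe([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

In a GPR setting, `length_scale` and `variance` would be among the hyperparameters tuned per indicator; a derivative-free optimizer such as Nelder–Mead can search over them because it needs only evaluations of the objective, not its gradient.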