Quantum Machine Learning for Water Pollution Profiling in the Rio Santiago Basin
Alan Abraham-Mexicano, Carlos V. Muro-Medina, Valentin Flores-Payan, Elisa Ramos-Pinzon, Carolina L. Recio-Colmenares, Roxana B. Recio-Colmenares, Cesar A. Garcia-Garcia

The Rio Santiago basin is one of the most environmentally stressed river systems in Mexico, with persistent organic, nutrient, microbial, surfactant, and metal contamination. This study develops a near-term quantum machine learning (QML) workflow for environmental monitoring and water-pollution profiling using multivariate records from 13 stations between 2009 and 2022. QML is evaluated here because quantum feature maps can define nonlinear, interaction-rich kernels that remain executable on present quantum hardware. These kernels are treated as an alternative representation to compare against classical PCA, RBF, UMAP, and HDBSCAN baselines, not as a presumed computational advantage. After quality screening, log transformation, standardization, and domain-guided feature selection, pollution profiles are evaluated across PCA, RBF spectral clustering, UMAP/KMeans, UMAP/HDBSCAN, a simulated ZZ-style quantum feature-map kernel, and Qiskit Runtime hardware evaluations of the same kernel concept. The initial cleaned-data results show that classical PCA clustering identifies broad profiles: lower-load, high organic/surfactant, and rain-season solids/microbial. UMAP/HDBSCAN provides the strongest cleaned full-sample nonlinear baseline, with a silhouette score of 0.568 after excluding 177 noise samples. The simulated quantum-kernel representation separates station-linked gradients. Matched stability diagnostics at n = 650 show near-identical quantum-kernel clustering across random initializations (mean ARI = 0.994 for cleaned data), although the RBF kernel remains the strongest nonlinear comparator. Two 24-sample Qiskit hardware runs and two matched 8-record hardware checks provide proof-of-execution evidence. The analysis is framed as a controlled representation study, not as a claim of quantum advantage.
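
To make the simulated kernel step concrete, the following is a minimal NumPy sketch of a ZZ-style feature-map fidelity kernel computed by exact statevector simulation. It is an illustration under stated assumptions, not the study's implementation: the function names (`zz_phases`, `hadamard_layer`, `feature_state`, `quantum_kernel`), the three-feature synthetic data, the two repetitions, and the phase convention (linear terms x_i, pairwise terms (pi - x_i)(pi - x_j), as in the commonly used ZZ feature map form) are all assumptions. The log transformation and standardization mirror the preprocessing named above in simplified form.

```python
import itertools
import numpy as np


def zz_phases(x):
    """Phase angle per computational basis state for one sample x (assumed convention)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Z eigenvalues (+1 or -1) of every qubit for each of the 2**n basis states.
    signs = np.array(list(itertools.product([1.0, -1.0], repeat=n)))
    theta = signs @ x  # single-qubit (linear) terms
    for i, j in itertools.combinations(range(n), 2):
        # Pairwise ZZ interaction terms.
        theta += (np.pi - x[i]) * (np.pi - x[j]) * signs[:, i] * signs[:, j]
    return theta


def hadamard_layer(n):
    """Dense matrix applying a Hadamard gate to each of n qubits."""
    h = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    full = np.array([[1.0]])
    for _ in range(n):
        full = np.kron(full, h)
    return full


def feature_state(x, reps=2):
    """Statevector after `reps` rounds of (Hadamard layer, diagonal ZZ phase)."""
    n = len(x)
    h_all = hadamard_layer(n)
    phase = np.exp(1j * zz_phases(x))
    state = np.zeros(2**n, dtype=complex)
    state[0] = 1.0  # start in |0...0>
    for _ in range(reps):
        state = phase * (h_all @ state)
    return state


def quantum_kernel(X, reps=2):
    """Fidelity kernel K[a, b] = |<phi(x_a)|phi(x_b)>|^2."""
    states = np.array([feature_state(x, reps) for x in X])
    overlaps = states.conj() @ states.T
    return np.abs(overlaps) ** 2


# Synthetic stand-in for concentration records (NOT Rio Santiago data).
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=1.0, sigma=0.8, size=(6, 3))

# Simplified preprocessing: log transformation, then per-feature standardization.
logged = np.log1p(raw)
X = (logged - logged.mean(axis=0)) / logged.std(axis=0)

K = quantum_kernel(X)
print(np.round(K, 3))
```

The resulting matrix is symmetric with a unit diagonal, so it can be passed as a precomputed affinity to a spectral clustering routine in the same way as an RBF kernel matrix. Exact simulation of this kind scales as 2^n in the number of encoded features, which is one reason the feature set has to stay small.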