Application of artificial intelligence for nutrient estimation in surface water bodies of basins with intensive agriculture
José Luis Medina-Jiménez, Leonel Ernesto Amabilis-Sosa, Kimberly Mendivil-García, Luis Alberto Morales-Rosales, Víctor Alejandro Gonzalez-Huitrón, Héctor Rodríguez-Rangel
Abstract
Eutrophication is one of the most relevant water-quality concerns because of the risk it poses to water supply and food security. The concentrations of nitrogen and phosphorus chemical species determine the risk and magnitude of eutrophication. These analyses are even more relevant in basins with intensive agriculture because of agrochemical discharges. However, analyzing these nutrients is labor-intensive: every step, from sampling to inter-calibration in the laboratory, requires considerable financial and human resources. Currently, artificial intelligence allows the modeling of phenomena and variables in various fields. This research explores machine learning (ML) methods, including multilayer perceptron (MLP), k-nearest neighbor (KNN), convolutional neural network (CNN), and random forest (RF), for the estimation of nutrients in surface waters of Sinaloa, Mexico (11 model basins), one of the states with the highest exports of agricultural products. Nutrients were considered in all possible chemical forms, such as total nitrogen, Kjeldahl nitrogen, ammonia nitrogen, total phosphorus, and orthophosphate. The input parameters selected for estimation are pH, dissolved oxygen, conductivity, water temperature, and total suspended solids, which do not require chemical reagents and can be measured in real time. The parameter information was obtained from the National Network for Water Quality Monitoring database (6200 records collected since 2012). Finally, normalization and hyperparameter optimization (HPO) methods were implemented to maximize the performance of the best model. The models obtained different coefficient of determination (R²) values: MLP from 0.64 to 0.77, CNN from 0.65 to 0.76, KNN from 0.64 to 0.79, and RF from 0.79 to 0.85. RF is therefore considered the best performer, with R² values of 0.95 in training and 0.94 in validation after applying HPO. Notably, the models are valid for any surface water body and in any climatic season in the state of Sinaloa, Mexico.
Therefore, decision-makers can use them for science-based environmental regulation of land use and pesticide application.
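
To make the workflow summarized in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of three of its ingredients: feature normalization, a KNN estimator, and the coefficient of determination used to compare models. KNN is shown only because it is the simplest of the four methods to write out. All function names (`min_max_fit`, `min_max_apply`, `knn_predict`, `r2_score`), the column order, the choice of min-max scaling, and the synthetic data are assumptions made for illustration; no values from the monitoring database are reproduced.

```python
import math
import random

# Assumed column order (illustrative): pH, dissolved oxygen, conductivity,
# water temperature, total suspended solids.


def min_max_fit(rows):
    """Return per-column (min, max) bounds learned from the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def min_max_apply(rows, bounds):
    """Scale each column with the training bounds (training data maps to [0, 1])."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k nearest training rows (Euclidean distance)."""
    pairs = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    nearest = pairs[:k]
    return sum(y for _, y in nearest) / len(nearest)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in rows; NOT records from the monitoring network.
    rows = [
        [rng.uniform(6, 9), rng.uniform(2, 10), rng.uniform(100, 2000),
         rng.uniform(15, 35), rng.uniform(5, 300)]
        for _ in range(200)
    ]
    # Made-up target relationship, used only so the sketch runs end to end.
    target = [0.002 * r[2] + 0.01 * r[4] - 0.1 * r[1] + rng.gauss(0, 0.2) for r in rows]

    train_x, valid_x = rows[:150], rows[150:]
    train_y, valid_y = target[:150], target[150:]

    # Bounds are learned on the training split only, then reused for validation.
    bounds = min_max_fit(train_x)
    train_s = min_max_apply(train_x, bounds)
    valid_s = min_max_apply(valid_x, bounds)

    preds = [knn_predict(train_s, train_y, q, k=5) for q in valid_s]
    print(f"validation R2 = {r2_score(valid_y, preds):.2f}")
```

In this sketch, hyperparameter optimization would amount to repeating the last two steps for several values of `k` and keeping the value with the highest validation R². The random forest reported as the best performer would typically come from an established ML library rather than being hand-written.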