DOI: 10.3390/w17010128 ISSN: 2073-4441

A Methodology Based on Random Forest to Estimate Precipitation Return Periods: A Comparative Analysis with Probability Density Functions in Arequipa, Peru

Johan Anco-Valdivia, Sebastián Valencia-Félix, Alain Jorge Espinoza Vigil, Guido Anco, Julian Booker, Julio Juarez-Quispe, Erick Rojas-Chura

Precipitation estimates for specific return periods play a crucial role in the design of hydraulic infrastructure for water management. The traditional analytical approach collects annual maximum precipitation data from a station, fits statistical probability distributions to the series, and selects the best-fit distribution using goodness-of-fit tests (e.g., Kolmogorov-Smirnov). However, this methodology depends on the data record being current, which raises concerns about its suitability when records are outdated. This study compares Probability Density Functions (PDFs) with the Random Forest (RF) machine learning algorithm for estimating precipitation at different return periods. Using data from twenty-six stations located across the Arequipa department in Peru, the performance of both methods was evaluated using the mean squared error (MSE), root mean squared error (RMSE), coefficient of determination (R²), and mean absolute error (MAE). The results show that RF outperforms PDFs in most cases, scoring better on these metrics for precipitation estimates at return periods of 2, 5, 10, 20, 50, and 100 years at the studied stations.
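
The traditional workflow and the four evaluation metrics can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: a single Gumbel distribution fitted by the method of moments stands in for the candidate PDFs, the Kolmogorov-Smirnov statistic is computed by hand, and the precipitation values are hypothetical. The Random Forest model is not reproduced here. All function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def fit_gumbel_moments(sample):
    """Method-of-moments estimates (location, scale) of a Gumbel distribution."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    scale = math.sqrt(6.0 * var) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale

def gumbel_cdf(x, loc, scale):
    """Non-exceedance probability F(x) of the Gumbel distribution."""
    return math.exp(-math.exp(-(x - loc) / scale))

def gumbel_quantile(T, loc, scale):
    """Depth whose non-exceedance probability is 1 - 1/T (T in years)."""
    p = 1.0 - 1.0 / T
    return loc - scale * math.log(-math.log(p))

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))

def error_metrics(observed, predicted):
    """MSE, RMSE, MAE and R2 between two equally long sequences."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}

# Hypothetical annual maximum daily precipitation (mm); NOT data from the study.
annual_max = [12.4, 18.9, 9.7, 25.3, 14.1, 30.8,
              11.2, 16.5, 21.0, 13.3, 19.8, 27.6]

loc, scale = fit_gumbel_moments(annual_max)
d_stat = ks_statistic(annual_max, lambda x: gumbel_cdf(x, loc, scale))

return_periods = [2, 5, 10, 20, 50, 100]
estimates = {T: gumbel_quantile(T, loc, scale) for T in return_periods}

for T, depth in estimates.items():
    print(f"T = {T:>3} yr: {depth:.1f} mm")
print(f"KS statistic: {d_stat:.3f}")
```

In a fuller comparison, the KS statistic would be computed for each candidate distribution and the smallest value would identify the best fit. `error_metrics` would then be applied to reference and estimated depths from each method.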
