The Rímac River in Lima, Peru, is a crucial water source for the city; however, its contamination by heavy metals, particularly lead, poses a significant environmental risk. Several studies have reported lead concentrations exceeding the limits established by water quality regulations, highlighting the need for accurate predictive models to monitor how these concentrations evolve. This study aims to forecast the average lead concentration in the Rímac River using time series models (ARIMA, SARIMA, and exponential smoothing methods) implemented in Python. Monthly lead concentration data from January 2020 to June 2024 were analyzed. The results indicate that the Grid Search ARIMA model provides the highest predictive accuracy, with the lowest Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) values and forecasts that closely match the observed values. In contrast, the exponential smoothing models (simple exponential smoothing (SES), Holt, and damped Holt) performed worse, with higher errors and a limited ability to capture the structure of the time series. These findings underscore the importance of employing advanced models in water quality management, enabling preventive and corrective strategies to mitigate the environmental risks associated with the contamination of the Rímac River.
Zevallos-Aquije et al. studied this question.
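The evaluation pattern described in the abstract (fit a candidate model, forecast a holdout period, and compare MSE and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: it substitutes simple exponential smoothing (SES) with a grid search over the smoothing parameter for the ARIMA order search, so that it runs with the Python standard library alone. The function names, the `alpha` grid, and the `series` values are all hypothetical and chosen only for illustration.

```python
import math


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def ses_forecast(train, alpha, steps):
    """Simple exponential smoothing.

    The level is initialised with the first observation and updated as
    level = alpha * y + (1 - alpha) * level. SES produces a flat forecast
    equal to the last smoothed level.
    """
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * steps


def grid_search_alpha(train, holdout, alphas):
    """Try every alpha in the grid; return (best_alpha, best_mse) on the holdout."""
    best = None
    for alpha in alphas:
        err = mse(holdout, ses_forecast(train, alpha, len(holdout)))
        if best is None or err < best[1]:
            best = (alpha, err)
    return best


# Made-up monthly values for illustration only (NOT the study's data).
series = [0.030, 0.034, 0.029, 0.041, 0.038, 0.035,
          0.033, 0.040, 0.037, 0.036, 0.039, 0.042]
train, test = series[:9], series[9:]

alphas = [i / 10 for i in range(1, 10)]  # hypothetical grid: 0.1 .. 0.9
best_alpha, best_mse = grid_search_alpha(train, test, alphas)
forecast = ses_forecast(train, best_alpha, len(test))

print("best alpha:", best_alpha)
print("MSE:", best_mse)
print("RMSE:", rmse(test, forecast))
```

The same loop structure would apply to an ARIMA search: the grid would range over candidate `(p, d, q)` orders rather than `alpha`, and the model with the lowest holdout error would be kept.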