This study evaluates the performance of several machine learning models in predicting dissolved oxygen concentration in the surface layer of the Mar Menor coastal lagoon. In recent years, this ecosystem has suffered a continuous process of eutrophication and episodes of hypoxia, mainly due to the sustained influx of nutrients from agricultural activities, causing severe water quality deterioration and mortality of local flora and fauna. In this context, monitoring the ecological status of the Mar Menor and its watershed is essential to understand the environmental dynamics that trigger these dystrophic crises. Using field data, eight predictive modelling approaches are compared: regularised linear regression methods (Ridge, Lasso, and Elastic Net), instance-based learning (k-nearest neighbours, KNN), kernel-based regression (support vector regression with a radial basis function kernel, SVR-RBF), and tree-based ensemble techniques (Random Forest, Regularised Random Forest, and XGBoost). The models are tested under multiple experimental settings involving spatial variability and varying time lags applied to physicochemical and meteorological predictors. The results show that incorporating time lags of approximately two weeks in physicochemical variables markedly improves the models’ ability to generalise to new data. Tree-based regression models achieved the best overall performance, with eXtreme Gradient Boosting (XGBoost) obtaining the best evaluation metrics. Finally, analysing predictions by sampling point reveals spatial patterns, underscoring the influence of local conditions on prediction quality and the need to consider both spatial structure and temporal inertia when modelling complex coastal lagoon systems.
Lorente-González et al. studied this question.
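To make the idea of lagged predictors concrete, the following is a minimal sketch of how predictor values measured about two weeks earlier could be paired with the dissolved oxygen value at each sampling point. It is not the authors' preprocessing, which the abstract does not describe. The field names (`station`, `date`, `temp_c`, `chl_a`, `do_mg_l`), the exact-date matching rule, and the sample values are all hypothetical.

```python
from datetime import date, timedelta


def add_lagged_predictors(records, predictors, lag_days=14):
    """Pair each target value with predictor values observed lag_days earlier
    at the same sampling point (assumes an exact date match exists)."""
    # Index observations by (station, date) so the earlier one can be looked up directly.
    by_key = {(r["station"], r["date"]): r for r in records}
    lagged = []
    for r in records:
        earlier = by_key.get((r["station"], r["date"] - timedelta(days=lag_days)))
        if earlier is None:
            continue  # no observation exactly lag_days earlier: drop this row
        row = {"station": r["station"], "date": r["date"], "do_mg_l": r["do_mg_l"]}
        for name in predictors:
            row[f"{name}_lag{lag_days}"] = earlier[name]
        lagged.append(row)
    return lagged


# Hypothetical weekly observations at two sampling points (values are made up).
records = [
    {
        "station": station,
        "date": date(2022, 1, 3) + timedelta(weeks=week),
        "temp_c": 12.0 + week + offset,
        "chl_a": 1.0 + 0.5 * week,
        "do_mg_l": 8.0 - 0.2 * week,
    }
    for station, offset in (("A", 0.0), ("B", 1.0))
    for week in range(4)
]

lagged = add_lagged_predictors(records, ["temp_c", "chl_a"], lag_days=14)
for row in lagged:
    print(row)
```

Rows without an observation exactly `lag_days` earlier are dropped, so each sampling point loses its first two weeks here. The resulting rows could then be passed to any of the regression models listed above. With irregular field sampling, a nearest-date match within a tolerance may be more appropriate than the exact match assumed in this sketch.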