Groundwater quality assessment is essential for sustainable water resource management, particularly in regions experiencing intensive anthropogenic pressures. In this study, we developed a data-driven framework that integrates groundwater quality indices, PSO-optimized XGBoost modeling, SHAP-based interpretability, and ensemble-based quantile uncertainty quantification. This framework enables simultaneous prediction, interpretation, and rigorous uncertainty analysis, and was applied to the Ganfu Plain, China, to predict both the water quality index (WQI) and entropy-weighted water quality index (EWQI) using hydrochemical and spatial parameters. The models achieved strong predictive performance, with nested cross-validation yielding low RMSE values across most folds. SHAP analysis consistently identified Mn, NO₃⁻, and spatial coordinates as the most influential predictors, whereas conventional bulk parameters such as total hardness and sulfate contributed minimally, particularly in the EWQI model. Uncertainty analysis showed that predictive variance was dominated by aleatoric contributions, while epistemic variance remained comparatively small but informative of data scarcity. Predictive intervals were wider for EWQI, reflecting its greater sensitivity to key parameters and its ability to capture localized spatial heterogeneity. Overall, the framework provides a robust, interpretable, and scalable tool for groundwater quality evaluation and sustainable management under data-limited conditions.
Ma et al. (Thu,) studied this question.
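The split between aleatoric and epistemic variance mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common law-of-total-variance decomposition for an ensemble, in which each member reports a predicted mean and a noise variance for a sample; the abstract does not state the exact estimator used in the study, so the function name `decompose_variance`, its inputs, and the numbers below are hypothetical and for illustration only.

```python
from statistics import fmean, pvariance


def decompose_variance(member_means, member_variances):
    """Split predictive variance for one sample via the law of total variance.

    member_means: predicted mean from each ensemble member (hypothetical input).
    member_variances: predicted noise variance from each ensemble member.
    Returns (aleatoric, epistemic, total).
    """
    if not member_means or len(member_means) != len(member_variances):
        raise ValueError("need one mean and one variance per ensemble member")
    # Aleatoric part: average noise variance reported by the members.
    aleatoric = fmean(member_variances)
    # Epistemic part: disagreement between members (population variance of means).
    epistemic = pvariance(member_means)
    return aleatoric, epistemic, aleatoric + epistemic


# Illustrative numbers only, not values from the study.
means = [52.0, 54.0, 53.0, 55.0]
variances = [9.0, 10.0, 11.0, 10.0]
aleatoric, epistemic, total = decompose_variance(means, variances)
```

Under this assumed decomposition, a sample whose ensemble members agree closely yields a small epistemic term even when the noise term is large, which matches the pattern the abstract reports: aleatoric variance dominating, with epistemic variance rising where data are sparse.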