The dynamics of iron in the surface waters of the Inaouen watershed are influenced by interrelated physico-chemical parameters. Their mutual correlation produces multicollinearity, which makes traditional regression models unreliable. To address this, we compare three penalized regression techniques: Ridge, Lasso, and Elastic Net. We apply them to a dataset containing the concentrations of HCO₃⁻, CaCO₃, Mg²⁺, Na⁺, K⁺, Cl⁻, Ca²⁺, SO₄²⁻, and Fe. Model performance is assessed with the root mean squared error (RMSE), the coefficient of determination (R²), the mean absolute error (MAE), and the AIC and BIC information criteria. The Lasso model performs best, with an RMSE of 2.52, an R² of 0.84, an MAE of 1.98, an AIC of 285.12, and a BIC of 293.84. It identifies important variables such as Ca²⁺, Na⁺, and K⁺ and eliminates those that are not needed. The Elastic Net model performs similarly to Lasso, whereas the Ridge model performs worse than the other two.
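The difference between the three penalties can be illustrated with a minimal sketch. It assumes an idealized orthonormal design, where each penalized coefficient has a closed form in terms of the ordinary least squares coefficient. That assumption does not hold for collinear data such as this dataset, where an iterative solver would be needed, so the sketch only shows why an L1 penalty can set coefficients exactly to zero while an L2 penalty cannot. The function names, the `lam` and `l1_ratio` parameters, and the coefficient values are hypothetical and are not taken from the study.

```python
def soft_threshold(b, lam):
    """Shrink b toward zero by lam, returning exactly 0.0 when |b| <= lam."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def shrink(ols_coefs, lam, l1_ratio):
    """Closed-form Elastic Net shrinkage under an orthonormal design.

    l1_ratio = 1.0 gives Lasso (pure L1), l1_ratio = 0.0 gives Ridge (pure L2).
    """
    l1 = lam * l1_ratio          # strength of the L1 part
    l2 = lam * (1.0 - l1_ratio)  # strength of the L2 part
    return {
        name: soft_threshold(b, l1) / (1.0 + l2)
        for name, b in ols_coefs.items()
    }


# Hypothetical least squares coefficients, NOT values from the study.
ols = {"Ca2+": 2.0, "Na+": -1.5, "K+": 1.2, "Mg2+": 0.3, "Cl-": -0.2}

lasso = shrink(ols, lam=0.5, l1_ratio=1.0)
ridge = shrink(ols, lam=0.5, l1_ratio=0.0)
enet = shrink(ols, lam=0.5, l1_ratio=0.5)

for name in ols:
    print(name, lasso[name], ridge[name], enet[name])
```

In this toy setting, Lasso zeroes the two smallest coefficients, Ridge shrinks every coefficient but keeps all of them nonzero, and Elastic Net falls between the two.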
Chaal et al. studied this question.
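The performance indicators named above (root mean squared error, mean absolute error, coefficient of determination, AIC, and BIC) can be computed as in the following sketch. The AIC and BIC expressions assume Gaussian errors and drop additive constants, which is one common convention. The source does not state which convention was used, so values from this sketch are not directly comparable to the reported ones. The function name `regression_metrics`, the `n_params` argument, and the sample data are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred, n_params):
    """Return RMSE, MAE, R^2, AIC and BIC for a fitted regression model.

    n_params is the number of estimated parameters (for Lasso, one option
    is to count the nonzero coefficients; this is an assumption).
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    rss = sum(r * r for r in residuals)             # residual sum of squares
    mean_y = sum(y_true) / n
    tss = sum((t - mean_y) ** 2 for t in y_true)    # total sum of squares
    log_term = n * math.log(rss / n)                # Gaussian log-likelihood term
    return {
        "rmse": math.sqrt(rss / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": 1.0 - rss / tss,
        "aic": log_term + 2 * n_params,
        "bic": log_term + n_params * math.log(n),
    }


# Illustrative data only, NOT measurements from the study.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
m = regression_metrics(y_true, y_pred, n_params=2)
print(m)
```

Lower RMSE, MAE, AIC, and BIC indicate a better model, as does a higher coefficient of determination. BIC penalizes additional parameters more heavily than AIC once the sample size exceeds about seven observations.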