What question did this study set out to answer?

To evaluate the effectiveness of machine learning approaches using low-frequency data for predicting biochemical oxygen demand and ammonium nitrogen in wastewater treatment.

March 8, 2026 | Open Access

Machine learning approaches for predicting biochemical oxygen demand and ammonium nitrogen: A decade-long weekly field study at a full-scale water resource recovery facility

Key Points

Analyzed low-frequency (weekly) datasets from a full-scale water resource recovery facility.
Employed linear and nonlinear machine learning algorithms for predictions.
Utilized various feature selection methods including filters, wrappers, and embedded techniques.
Calculated root mean square error (RMSE) to assess model performance.
A multilayer perceptron achieved an RMSE of 23.50 mg/L for BOD in primary effluent.
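Support vector regression with an RBF kernel, SVR (rbf), combined with random forest feature selection achieved an RMSE of 3.26 mg/L for BOD in secondary effluent.
For ammonium nitrogen, multiple linear regression with backward elimination achieved an RMSE of 2.71 mg/L in primary effluent, and a random forest with five key predictors achieved 1.51 mg/L in secondary effluent.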
Demonstrated that reduced features can maintain predictive accuracy while decreasing computational complexity.
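
The RMSE values above follow the standard definition: the square root of the mean squared difference between measured and predicted concentrations. A minimal sketch, assuming paired lists of weekly measurements and model predictions in mg/L; the function name `rmse` and the sample values are illustrative and are not taken from the study.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between two equal-length sequences (same units)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical weekly BOD values in mg/L (not data from the study).
measured_bod = [12.0, 15.0, 9.0, 11.0]
predicted_bod = [10.0, 15.0, 12.0, 11.0]
print(round(rmse(measured_bod, predicted_bod), 3))
```

Because the errors are squared before averaging, a single badly predicted week raises the RMSE more than several small misses of the same total size.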

Abstract

Unlike previous studies that rely on high-frequency (15-min or hourly) datasets, this study is among the first to use low-frequency (weekly) data to evaluate the performance of linear and nonlinear machine learning (ML) algorithms for predicting biochemical oxygen demand (BOD) and ammonium nitrogen (NH₄⁺-N) in the primary and secondary treatment effluents from the Subiaco Water Resource Recovery Facility (WRRF) in Western Australia. Various feature selection methods, including filters, wrappers, and embedded methods, were employed to identify the most effective approach that achieves the highest model performance while enhancing computational efficiency. The results demonstrate that a reduced set of key features can achieve comparable predictive accuracy with lower computational complexity. For BOD prediction in primary effluent, a multilayer perceptron (MLP) achieved a root mean square error (RMSE) of 23.50 mg per liter (mg/L) using features selected based on mutual information. In the secondary effluent, SVR (rbf) and random forest feature selection yielded the best predictions, achieving an RMSE of 3.26 mg/L. Similarly, for NH₄⁺-N, multiple linear regression with backward elimination achieved an RMSE of 2.71 mg/L in the primary effluent. In comparison, a random forest with five key predictors achieved an RMSE of 1.51 mg/L in the secondary effluent, indicating high accuracy in NH₄⁺-N prediction. These findings demonstrate that data-driven models can predict BOD and NH₄⁺-N using low-frequency monitoring data, supporting supervisory-level operational decision-making in wastewater treatment plants and near-real-time wastewater quality assessment. Furthermore, generalization analysis indicates that linear models perform more consistently across multiple targets and evaluation metrics.

• Application of low-frequency data for ML-based wastewater prediction.
• First-time prediction of BOD and NH₄⁺-N in primary treatment effluent.
• Comparative evaluation of seven linear and nonlinear ML algorithms.
• Systematic comparison of filter, wrapper, and embedded feature selection methods.
• Linear models demonstrate superior generalization under sparse monitoring conditions.
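
Filter methods, one of the three feature selection families named in the abstract, score each candidate feature against the target independently of any model and keep the top-ranked ones. The sketch below illustrates the idea using absolute Pearson correlation as the score, a simpler stand-in for the mutual information criterion the abstract mentions. It is not the authors' pipeline: the feature names, the values, and the helpers `pearson` and `select_top_k` are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def select_top_k(features, target, k):
    """Rank features by |correlation| with the target and return the k best names."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical weekly observations (not data from the study).
features = {
    "influent_cod": [400.0, 450.0, 500.0, 550.0, 600.0],
    "influent_tss": [200.0, 230.0, 240.0, 270.0, 300.0],
    "flow_rate": [60.0, 58.0, 61.0, 59.0, 60.0],
    "ph": [7.1, 7.3, 7.0, 7.2, 7.1],
}
effluent_bod = [20.0, 23.0, 25.0, 27.0, 31.0]

selected = select_top_k(features, effluent_bod, k=2)
print(selected)
```

Correlation only captures linear dependence, which is one reason a study might prefer mutual information as a filter score. Wrapper methods such as backward elimination differ in that they refit a model each time a feature is dropped, which costs more computation but accounts for interactions between features.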

Cite this study

Khoshvaght et al. (Fri,) studied this question.

synapsesocial.com/papers/69ada873bc08abd80d5bb665 — DOI: https://doi.org/10.1016/j.engappai.2026.114349

Authors

Hoda Khoshvaght

Edith Cowan University

Rizki Permala

National Research and Innovation Agency

Amir Razmjou

Edith Cowan University

Journals

Engineering Applications of Artificial Intelligence

Institutions

Edith Cowan University

Water Corporation of Western Australia (Australia)
