ABSTRACT

Effective phosphorus control in dissolved air flotation (DAF) systems is essential for regulatory compliance in full-scale wastewater treatment plants, yet operational decisions are often constrained to daily time scales because online sensing is limited. This study proposes an interpretable, feature-engineered machine-learning framework for daily prediction of effluent total phosphorus (T-P) and sensitivity-based optimization of coagulant dosing in a full-scale municipal DAF system (410,000 m³/day). Long-term operational, water-quality, and meteorological data (1,096 days) were preprocessed using three-sigma outlier screening and multivariate imputation by chained equations (MICE). Mechanistically informed features were incorporated to capture influent loading, operational conditions, short-term variability of effluent T-P (1–3 day difference-based features), and seasonal effects. Among the evaluated models, Random Forest achieved the best performance (test R² = 0.818; RMSE = 0.032 mg/L), corresponding to a prediction error of about 16% of the discharge limit (0.2 mg/L). SHAP analysis identified influent T-P, coagulant dosage, and short-term variation as the dominant drivers across seasons. A sensitivity-based autoregressive simulation indicated that optimized dosing could reduce coagulant consumption by 32–51%, yielding an estimated annual cost saving of 1.53 billion KRW while improving effluent stability. The proposed framework demonstrates the practical value of daily-scale, interpretable machine learning for data-driven DAF operation.

Graphical abstract: the proposed framework for daily-scale prediction of effluent T-P and sensitivity-based optimization of coagulant dosing in a full-scale DAF system, integrating data preprocessing, machine learning, and operational decision support.
Park et al. studied this question.
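As a rough illustration of two preprocessing steps named in the abstract, the sketch below shows three-sigma outlier screening and 1–3 day difference features for a daily effluent T-P series. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `three_sigma_screen` and `diff_features` are hypothetical, the screening is assumed to use the sample mean and standard deviation of a single variable, flagged values are assumed to be set to missing (so that an imputation step such as MICE could fill them later), and the difference at lag k is assumed to be the value on day t minus the value on day t − k.

```python
from statistics import mean, stdev


def three_sigma_screen(values):
    """Return a copy of `values` with points farther than 3 standard
    deviations from the mean replaced by None (treated as missing).

    Assumption: statistics are computed over the non-missing values of
    one variable; the source does not specify these details.
    """
    observed = [v for v in values if v is not None]
    mu, sigma = mean(observed), stdev(observed)
    if sigma == 0:
        # Constant series: nothing can be flagged.
        return list(values)
    return [
        v if v is None or abs(v - mu) <= 3 * sigma else None
        for v in values
    ]


def diff_features(series, lags=(1, 2, 3)):
    """For each day t, return {lag: series[t] - series[t - lag]}.

    Entries are None when either value is missing or when t - lag falls
    before the start of the series. The lag definition is an assumption.
    """
    rows = []
    for t in range(len(series)):
        row = {}
        for k in lags:
            if t - k >= 0 and series[t] is not None and series[t - k] is not None:
                row[k] = series[t] - series[t - k]
            else:
                row[k] = None
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up values for illustration only (mg/L), not data from the study.
    tp = [0.1] * 19 + [5.0]
    print(three_sigma_screen(tp)[-1])
    print(diff_features([0.10, 0.12, 0.11, 0.15])[3])
```

In a forecasting setting, difference features would normally be built only from values available before the prediction day, so a real pipeline would likely shift them by one day; that shift is omitted here for brevity.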