April 10, 2026 | Open Access

Multi-algorithm reinforcement learning framework with feedforward networks for resilient water tank scheduling systems

Key Points

The research focuses on evaluating multiple reinforcement learning algorithms for efficient water tank scheduling in variable demand conditions.
Benchmarking PPO, DQN, and A3C algorithms for water tank scheduling.
Sensitivity analysis of hyperparameters affecting algorithm performance.
Simulation framework incorporates LSTM networks and realistic demand variability.
In long-horizon scenarios, PPO achieves up to 40% fewer pump activations and 25% fewer safety violations than DQN.
DQN performs well only within a narrow learning-rate range, whereas PPO is robust across a broader one.
Under extreme demand conditions, PPO's performance drops by only 12%, compared with 28% for DQN.

Abstract

Efficient and resilient control of water distribution systems (WDS) is critical for sustainable infrastructure management under increasingly uncertain demand conditions. This study presents a comprehensive benchmarking and sensitivity analysis of three reinforcement learning algorithms, Proximal Policy Optimization (PPO), Deep Q-Network (DQN), and Asynchronous Advantage Actor-Critic (A3C), for water tank scheduling across multi-day planning horizons. Our simulation-based framework incorporates realistic demand variability, extreme operational scenarios, and temporal modeling using LSTM networks to enable robust agent training. Extensive evaluation reveals that PPO achieves superior performance in long-horizon scenarios, with up to 40% fewer pump activations and 25% fewer safety violations than DQN, while maintaining competitive performance across shorter horizons. A detailed sensitivity analysis identifies learning rate as the most critical hyperparameter, with DQN showing narrow optimal ranges (1 × 10^-3) compared to PPO's broader robustness (1 × 10^-5 to 3 × 10^-4). The ablation study demonstrates that while LSTM networks enhance temporal modeling, the simpler DQN-FFN architecture notably outperforms LSTM-augmented counterparts, achieving superior cumulative rewards (−93.85 vs −134.15 for PPO-LSTM). Under extreme demand noise of up to ±50 units, PPO demonstrates exceptional robustness, with only 12% performance degradation compared to 28% for DQN. The study provides practical guidelines for algorithm selection, hyperparameter tuning, and action-space design, establishing a foundation for transparent AI-driven control in complex WDS, with direct implications for Industry 4.0/5.0 infrastructure modernization.
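To make the evaluation metrics in the abstract concrete, the following is a minimal sketch of how pump activations and safety violations might be counted in a toy tank simulation with bounded demand noise. It is not the authors' simulator: the function names (`simulate_tank`, `hysteresis_policy`), the tank capacity, safe band, pump rate, base demand, and the rule-based policy standing in for a trained agent are all illustrative assumptions.

```python
import random


def simulate_tank(policy, steps=72, noise=0.0, seed=0,
                  capacity=1000.0, low=200.0, high=900.0,
                  base_demand=60.0, pump_rate=120.0):
    """Run one episode of a toy tank model (all parameters are assumed values).

    Returns (pump_activations, safety_violations).
    """
    rng = random.Random(seed)      # seeded so runs are reproducible
    level = capacity / 2           # start half full
    pump_on = False
    activations = 0
    violations = 0
    for _ in range(steps):
        action = policy(level, pump_on)   # True means "run the pump"
        if action and not pump_on:
            activations += 1              # count off -> on switches only
        pump_on = action
        # Demand is the base value plus uniform noise in [-noise, +noise].
        demand = max(0.0, base_demand + rng.uniform(-noise, noise))
        level += (pump_rate if pump_on else 0.0) - demand
        level = min(max(level, 0.0), capacity)   # physical limits of the tank
        if level < low or level > high:
            violations += 1               # level left the assumed safe band
    return activations, violations


def hysteresis_policy(level, pump_on, on_below=350.0, off_above=750.0):
    """Hypothetical rule-based stand-in for a learned policy."""
    if level < on_below:
        return True
    if level > off_above:
        return False
    return pump_on                        # otherwise keep the current state


if __name__ == "__main__":
    for noise in (0.0, 50.0):             # 50 mirrors the +/-50 unit stress case
        print(noise, simulate_tank(hysteresis_policy, noise=noise))
```

Replacing `hysteresis_policy` with a trained agent's action function, and comparing the returned counts at `noise=0.0` and `noise=50.0`, gives one simple way to express a robustness comparison of the kind reported above.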

Authors

Hee-Beom Park

Akeem Bayo Kareem

Yusuf Olatunji Kareem

Journals

Complex & Intelligent Systems

Institutions

Kumoh National Institute of Technology

University of Ilorin
