This study proposes a health-aware energy management strategy for hybrid fuel cell/battery-powered ships, based on the twin delayed deep deterministic policy gradient (TD3) algorithm. Unlike traditional approaches, which treat multiple fuel cell stacks as homogeneous units, the strategy allocates power to each stack according to its real-time state of health. The study first shows, at the algorithmic level, that the TD3 framework outperforms the deep Q-learning framework. Comparative experiments across three scenarios with increasing state of health disparity between stacks then show that the health-aware differentiated TD3 strategy significantly reduces the total voyage cost relative to a TD3 baseline that splits power equally among stacks, and that the saving grows as the disparity widens. In addition, incorporating rule-based constraints speeds up the convergence of the TD3 algorithm, which improves its feasibility for real-time control. Tests under dynamic, fluctuating load conditions further confirm the strategy’s effectiveness and applicability. In summary, the proposed health-aware TD3 strategy provides an efficient and reliable energy management solution for hybrid-powered ships and promotes the application of machine learning to ship energy management.
Zhu et al. studied this question.
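To make the contrast between average and differentiated power allocation concrete, the following minimal sketch compares an equal split with a split weighted by state of health. It is an illustration only and not the learned TD3 policy of the study: the proportional weighting rule, the function names, the per-stack limit `p_max_kw`, and the assumption that any clipped remainder is covered by the battery are all hypothetical.

```python
from typing import List, Tuple


def equal_split(demand_kw: float, n_stacks: int) -> List[float]:
    """Baseline: every stack receives the same share of the demand."""
    return [demand_kw / n_stacks] * n_stacks


def soh_weighted_split(
    demand_kw: float, soh: List[float], p_max_kw: float
) -> Tuple[List[float], float]:
    """Hypothetical heuristic: shares proportional to each stack's
    state of health (SoH), clipped to a per-stack power limit.

    Returns the per-stack allocation and the remaining demand, which
    is assumed here to be covered by the battery.
    """
    total_soh = sum(soh)
    # Healthier stacks (higher SoH) take a larger share of the load.
    alloc = [min(demand_kw * s / total_soh, p_max_kw) for s in soh]
    remainder_kw = demand_kw - sum(alloc)
    return alloc, remainder_kw


if __name__ == "__main__":
    soh = [1.0, 0.8, 0.6]  # assumed SoH values, for illustration only
    print(equal_split(240.0, len(soh)))
    print(soh_weighted_split(240.0, soh, p_max_kw=90.0))
```

In a learned strategy the shares would come from the policy network rather than from a fixed proportional rule; a limit such as `p_max_kw` is one example of the kind of rule-based constraint that could bound the actions.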