Abstract: Simultaneously ensuring the operational efficiency and safety of energy systems remains a critical challenge for fuel cell vehicle energy management. Mainstream deep reinforcement learning (DRL) approaches often fail to adequately address explicit safety constraints, especially those concerning lithium-ion battery (LIB) thermal management. This study proposes a safety-guided DRL framework that introduces an independent safety-guided network to enforce safety constraints explicitly and reliably. By decoupling safety assurance from objective optimization, the architecture overcomes the mutual interference and reward-tuning difficulties inherent in existing reward-penalty methods. Validated on a fuel cell bus platform, the method outperforms state-of-the-art baselines, improving fuel economy by 8.36% and LIB thermal safety by 10.14% under full-load conditions. Notably, it maintains a zero unsafe duration ratio across real-world scenarios and reduces violation severity by up to 21.88% under extreme thermal conditions. These results demonstrate the proposed method’s robust safety assurance and generalization capability, positioning it as a practical solution for intelligent vehicle energy management.
Jia et al. (Mon,) studied this question.