February 15, 2022

Safe Model-Based Off-Policy Reinforcement Learning for Eco-Driving in Connected and Automated Hybrid Electric Vehicles

Abstract

Deep Reinforcement Learning (DRL) has recently been applied to eco-driving to intelligently reduce fuel consumption and travel time. While previous studies combine simulators with model-free DRL (MFDRL), this work proposes a Safe Off-policy Model-Based Reinforcement Learning (SMORL) algorithm for eco-driving. SMORL integrates three key components: a computationally efficient model-based trajectory optimizer, a value function learned off-policy, and a learned safe set. The advantages over the existing literature are three-fold. First, combining off-policy learning with a physics-based model improves sample efficiency. Second, training does not require any extrinsic rewarding mechanism for constraint satisfaction. Third, the feasibility of the trajectory is guaranteed by a safe set approximated by deep generative models. The performance of SMORL is benchmarked over 100 trips against a baseline controller representing human drivers, a non-learning-based optimal controller, a previously designed MFDRL strategy, and the wait-and-see optimal solution. In simulation, SMORL reduces fuel consumption by more than 21% relative to the baseline controller while keeping the average speed comparable, and it achieves better fuel economy while driving faster than both the MFDRL agent and the non-learning-based optimal controller.
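
To make the interplay of the three components concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy point-mass vehicle approaching a stop line, plans by exhaustive search over a short horizon, and uses hand-written stand-ins for the two learned parts: `terminal_value` takes the place of the value function learned off-policy, and `in_safe_set` takes the place of the safe set approximated by deep generative models. All names, constants, and cost terms are hypothetical.

```python
from itertools import product

# All constants below are illustrative assumptions, not values from the paper.
DT = 1.0                    # time step [s]
ACTIONS = (-2.0, 0.0, 1.5)  # candidate accelerations [m/s^2]
V_MAX = 15.0                # speed limit [m/s]
STOP_LINE = 100.0           # position of a stop line [m]
MAX_BRAKE = 3.0             # strongest available deceleration [m/s^2]


def step(state, accel):
    """Toy physics model: point mass with position and speed."""
    pos, vel = state
    new_vel = min(max(vel + accel * DT, 0.0), V_MAX)
    new_pos = pos + 0.5 * (vel + new_vel) * DT
    return (new_pos, new_vel)


def stage_cost(state, accel):
    """Fuel proxy (positive acceleration only) plus a per-step time penalty."""
    return max(accel, 0.0) ** 2 + 0.1 * DT


def terminal_value(state):
    """Hypothetical stand-in for a learned value function (cost-to-go)."""
    pos, _vel = state
    return 0.05 * max(STOP_LINE - pos, 0.0)


def in_safe_set(state):
    """Hypothetical stand-in for a learned safe set: can still stop in time."""
    pos, vel = state
    return pos + vel ** 2 / (2.0 * MAX_BRAKE) <= STOP_LINE


def plan(state, horizon=3):
    """Return the cheapest action sequence whose terminal state is safe."""
    best_cost, best_seq = float("inf"), None
    for seq in product(ACTIONS, repeat=horizon):
        s, cost = state, 0.0
        for a in seq:
            cost += stage_cost(s, a)
            s = step(s, a)
        if not in_safe_set(s):
            continue  # constraint handled by the safe set, not by a reward term
        cost += terminal_value(s)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq, best_cost
```

Calling `plan((0.0, 10.0))` returns the best safe action sequence and its cost; when no sequence ends in the safe set, it returns `(None, inf)`. Note that safety enters only as a hard terminal constraint, which mirrors the abstract's point that no extrinsic reward is needed for constraint satisfaction.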
