Signalized intersections force vehicles into frequent idling and repeated acceleration, which increases energy consumption. Eco-driving strategies that let vehicles pass through intersections without stopping are therefore a key mitigation. This study applies the soft actor–critic (SAC) reinforcement-learning algorithm to develop a control policy that enables vehicle convoys to pass through two consecutive signalized intersections without stopping. By leveraging the maximum-entropy framework, SAC balances exploration and exploitation, allowing for robust and adaptive decision-making under complex traffic dynamics. Simulation experiments compare SAC with two natural driving models, the intelligent driver model (IDM) and its noisy variant (N-IDM), and with two other reinforcement-learning methods, deep deterministic policy gradient (DDPG) and proximal policy optimization (PPO), under varied traffic conditions. The results show that SAC reduces energy consumption by 21.38% (versus IDM), 21.18% (versus N-IDM), 0.24% (versus DDPG), and 4.03% (versus PPO) while maintaining competitive travel time and speed. These results support the effectiveness and robustness of SAC and highlight its potential for deployment in real-world mixed-traffic environments.
Chen et al. studied this question.
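For readers unfamiliar with the maximum-entropy framework mentioned above, the standard SAC objective from the general reinforcement-learning literature can be written as follows. This is a sketch of the generic formulation, not necessarily the exact objective used in this study: the reward $r$, the temperature $\alpha$, the horizon $T$, and the state-action distribution $\rho_\pi$ are generic symbols, and the study's specific reward design (for example, how energy use and stopping are penalized) is not given here.

```latex
% Generic maximum-entropy RL objective (standard SAC form; symbols are
% illustrative and not taken from this study's own notation).
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \Big[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = -\,\mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \big[ \log \pi(a \mid s_t) \big]
```

The entropy term $\mathcal{H}$ rewards the policy for staying stochastic, and the temperature $\alpha$ sets the trade-off: a larger $\alpha$ favors exploration, while $\alpha \to 0$ recovers the conventional expected-return objective.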