Traffic signal control is central to urban mobility because it directly influences congestion, travel time, and emissions. Traditional methods, including fixed-time scheduling and actuated control, can handle normal conditions but often fail when traffic changes suddenly. Reinforcement learning (RL) has gained attention as a data-driven approach that adapts control policies through interaction with the traffic environment. Recent studies have investigated both single-agent and multi-agent formulations, encompassing value-based, policy-based, and actor–critic learning paradigms. This paper reviews how RL has been adopted for traffic signal control, grouping the core approaches and highlighting their evaluation practices. The review shows that most studies still focus on efficiency measures such as delay and queue length, while safety and environmental factors are addressed less frequently. Nearly all evaluations are done in simulation, with SUMO and CityFlow as the dominant platforms. Key challenges remain in handling multiple objectives, scaling to large networks, and transferring simulation-based results to real-world deployment. By outlining current methods, their strengths and weaknesses, and the gaps that persist, this review points to the directions needed for RL to move from research to practice.
Yusuf et al. (Thu,) studied this question.