Urban traffic congestion is a critical infrastructure challenge for modern cities as vehicle populations grow and urban density increases. Conventional fixed-timing traffic signal systems cannot adapt to the stochastic, dynamic nature of real-world traffic flows, which leads to wasted green-light time, queue buildup, increased vehicle emissions, and delayed emergency response. This paper presents TrafficOpt RL, an end-to-end adaptive traffic signal optimization system that applies the Deep Q-Network (DQN) algorithm to learn signaling policies for urban intersections through iterative simulation experience. The system is built on a custom Gymnasium-compatible simulation environment that models a four-way intersection with stochastic Poisson vehicle arrivals. The DQN agent, implemented with the Stable-Baselines3 framework, uses experience replay, target-network stabilization, and epsilon-greedy exploration to converge on policies that minimize aggregate vehicle waiting time and maximize intersection throughput. All training metrics and simulation data are stored persistently in a MySQL relational database through automated callback logging, enabling systematic performance analysis. In a direct comparison against a fixed-timing baseline, the reinforcement learning approach performs measurably better on three metrics: average vehicle waiting time, total throughput, and a composite efficiency score. Three analytical visualizations are generated to communicate system performance. TrafficOpt RL is a practical proof of concept for integrating deep reinforcement learning into intelligent transportation systems and smart city infrastructure.
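To make the simulation setup concrete, the following is a minimal sketch of a four-way intersection with Poisson arrivals and a fixed-timing baseline. It is not the TrafficOpt RL implementation: the class name `IntersectionSketch`, the arrival rate, the discharge rate, the two-phase action encoding, and the queue-length reward are all illustrative assumptions. The sketch only mirrors the Gymnasium `reset`/`step` return convention using the standard library; an environment meant for Stable-Baselines3 would additionally subclass `gymnasium.Env` and declare `observation_space` and `action_space`.

```python
import math
import random


class IntersectionSketch:
    """Toy four-way intersection (hypothetical; not the paper's environment)."""

    def __init__(self, arrival_rate=0.3, discharge_per_step=2, horizon=200, seed=0):
        self.arrival_rate = arrival_rate              # assumed mean arrivals per approach per step
        self.discharge_per_step = discharge_per_step  # assumed vehicles released per green approach
        self.horizon = horizon                        # assumed episode length in steps
        self.rng = random.Random(seed)
        self.reset()

    def _poisson(self, lam):
        # Knuth's multiplication method for sampling a Poisson variate.
        threshold = math.exp(-lam)
        k, p = 0, 1.0
        while True:
            p *= self.rng.random()
            if p <= threshold:
                return k
            k += 1

    def reset(self):
        self.queues = [0, 0, 0, 0]  # queue lengths for N, S, E, W approaches
        self.t = 0
        self.arrived = 0
        self.departed = 0
        return tuple(self.queues), {}

    def step(self, action):
        # Stochastic arrivals on every approach.
        for i in range(4):
            n = self._poisson(self.arrival_rate)
            self.queues[i] += n
            self.arrived += n
        # Action 0 gives green to N/S, action 1 gives green to E/W.
        green = (0, 1) if action == 0 else (2, 3)
        for i in green:
            served = min(self.queues[i], self.discharge_per_step)
            self.queues[i] -= served
            self.departed += served
        self.t += 1
        reward = -sum(self.queues)  # assumed proxy for aggregate waiting time
        truncated = self.t >= self.horizon
        return tuple(self.queues), reward, False, truncated, {}


def fixed_timing_policy(period=10):
    """Baseline: alternate phases every `period` steps, ignoring traffic."""
    def policy(obs, t):
        return (t // period) % 2
    return policy


def evaluate(env, policy):
    """Run one episode and report mean total queue length and throughput."""
    obs, _ = env.reset()
    total_queue, t, truncated = 0, 0, False
    while not truncated:
        obs, reward, _, truncated, _ = env.step(policy(obs, t))
        total_queue += -reward
        t += 1
    return {
        "mean_queue": total_queue / t,
        "throughput": env.departed,
        "arrived": env.arrived,
    }


if __name__ == "__main__":
    print(evaluate(IntersectionSketch(seed=1), fixed_timing_policy(10)))
```

A learned policy would replace `fixed_timing_policy` with a function that maps the observed queue lengths to a phase, so the same `evaluate` routine can compare both controllers on identical arrival statistics.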
Lavanya et al. (Thu,) studied this question.