Multi-Agent Reinforcement Learning (MARL) provides an effective approach to urban multi-intersection traffic signal control. However, existing methods face two fundamental challenges: policy homogenization and inefficient credit assignment. The former causes agent policies to converge toward one another and fail to adapt to heterogeneous traffic patterns, while the latter prevents agents from accurately evaluating their individual contributions to system performance. To address these issues, this paper proposes a Multi-Agent Hierarchical Contrastive Learning Traffic Signal Control (MAHCL-TSC) model. The model incorporates an unsupervised contrastive learning module that enhances the discriminative power of state representations, thereby alleviating policy homogenization. It also introduces a hierarchical graph convolutional credit allocation network that leverages road network topology and functional characteristics to enable structure-aware collaborative value estimation, significantly improving the precision of credit assignment. Building on these components, a Contrastive QTRAN with Hierarchical Graph Convolution (CQTRAN-HGC) algorithm is proposed, which jointly optimizes a contrastive learning loss and the QTRAN constraint loss. Experiments in the Simulation of Urban Mobility (SUMO) environment on 4 × 4 and 6 × 6 synthetic grid networks demonstrate that the proposed model improves traffic signal control performance under the tested structured simulation settings and shows potential scalability as the network size increases.
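The joint objective described above can be illustrated with a minimal sketch. The abstract does not state which contrastive loss is used or how the two terms are weighted, so this example assumes an InfoNCE-style loss over cosine similarities and a simple weighted sum; the names `info_nce`, `joint_loss`, `temperature`, and `weight` are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.5):
    """Mean InfoNCE loss (assumed form, not confirmed by the paper).

    anchors[i] is expected to match positives[i]; every other entry of
    `positives` serves as a negative for anchor i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - logits[i]
    return total / len(anchors)


def joint_loss(qtran_loss, contrastive_loss, weight=0.1):
    """Weighted sum of the two terms; `weight` is a hypothetical coefficient."""
    return qtran_loss + weight * contrastive_loss


# Toy state embeddings for two intersections (illustrative values only).
embeddings = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(embeddings, embeddings)
mismatched = info_nce(embeddings, list(reversed(embeddings)))
```

In this toy case the loss is lower when each embedding is paired with its own positive than when the pairs are swapped, which is the property a contrastive term relies on to keep agents' state representations distinguishable.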
Yan et al. (Fri,) studied this question.