Mandatory lane changes (MLCs) pose significant challenges to trajectory planning at intersections, where vehicles must change lanes mid-block to reach their designated turn lanes before the stop bar. MLCs often generate shockwaves that increase vehicle delay and fuel consumption, and the presence of human-driven vehicles in mixed traffic exacerbates the problem. To address these challenges, this study formulates joint longitudinal-lateral trajectory planning in mixed traffic as a multi-agent reinforcement learning (MARL) task. We propose SS-MA-PPO, a Simulation-Supervised Multi-Agent Proximal Policy Optimization framework that guides connected and autonomous vehicles (CAVs) in both acceleration and lane-change decisions. A Simulation-Guided Supervisory Module (SGSM) performs offline trajectory rollouts of human-driver models to assess feasibility and safety, and arbitrates online between rule-based and learned policies. Information about surrounding vehicles is incorporated into the observation to enable vehicle cooperation, and a transfer learning mechanism is designed to accelerate training. Experiments on a real-world dataset from Langfang, China show that SS-MA-PPO outperforms both conventional and MARL baselines across multiple evaluation metrics. Ablation experiments confirm the effectiveness of the SGSM, vehicle cooperation, and transfer learning, which together yield better performance and faster training convergence. The source code is available at: https://github.com/Xingwei-Jiang/SS-MA-PPO.
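To make the arbitration idea concrete, the following is a minimal illustrative sketch, not the SGSM as implemented in the released code. It assumes a single follower in the target lane, replaces the human-driver model with a constant-speed follower, and runs the rollout at decision time for brevity (the abstract describes the rollouts as offline). All names (`Vehicle`, `Action`, `min_gap_after_rollout`, `arbitrate`) and the parameters `horizon`, `dt`, and `safe_gap` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    pos: float    # longitudinal position along the road (m)
    speed: float  # longitudinal speed (m/s)


@dataclass
class Action:
    accel: float       # commanded acceleration (m/s^2)
    change_lane: bool  # whether to move into the target lane


def min_gap_after_rollout(ego, follower, accel, horizon=3.0, dt=0.1):
    """Roll both vehicles forward and return the smallest ego-minus-follower gap (m)."""
    ego_pos, ego_speed = ego.pos, ego.speed
    fol_pos = follower.pos
    gap = ego_pos - fol_pos
    for _ in range(int(round(horizon / dt))):
        # Ego follows the proposed acceleration; speed is clamped at zero.
        ego_speed = max(0.0, ego_speed + accel * dt)
        ego_pos += ego_speed * dt
        # Assumption: the human-driven follower keeps a constant speed.
        fol_pos += follower.speed * dt
        gap = min(gap, ego_pos - fol_pos)
    return gap


def arbitrate(ego, follower, learned, rule_based, safe_gap=5.0):
    """Return the learned action unless its lane change fails the rollout check."""
    if learned.change_lane:
        if min_gap_after_rollout(ego, follower, learned.accel) < safe_gap:
            return rule_based  # fall back to the rule-based action
    return learned


if __name__ == "__main__":
    ego = Vehicle(pos=20.0, speed=10.0)
    learned = Action(accel=0.0, change_lane=True)
    rule_based = Action(accel=0.0, change_lane=False)
    # A much faster follower closes the gap, so the supervisor overrides the lane change.
    chosen = arbitrate(ego, Vehicle(pos=0.0, speed=20.0), learned, rule_based)
    print(chosen.change_lane)
```

In this sketch the supervisor only vetoes lane changes; a fuller version would presumably also check the leader in the target lane and the feasibility of reaching the turn lane before the stop bar.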
Jiang et al. studied this question.