Ensuring safety in interactive autonomous driving remains a core challenge for reinforcement learning (RL), because agents must act under uncertainty and handle rare but critical events (e.g., collisions) while respecting traffic rules such as yielding and right-of-way. To address these challenges, we propose DiStaK, a distributional Stackelberg RL framework that models driving interactions as a bilevel leader–follower game. A practical discrete instantiation, DiStaK-C51, augments the safety layer with a C51-based cost head that estimates the full distribution of safety costs, and it constructs chance-constrained admissible action sets by thresholding the cumulative distribution function (CDF). To improve efficiency, DiStaK-C51 replaces exhaustive joint-action Stackelberg enumeration with a retriever–refiner Top-K′/Top-K selection rule: a lightweight retriever produces a small list of K′ candidates, chance-constraint screening filters out unsafe actions, and a final Top-K shortlist supports critic-based refinement. The follower selects a risk-aware best response by maximizing Q₂ − λ₂C₂, with an adaptive dual update on λ₂. Leader actions can be screened based on the induced interaction outcome, with a relaxation fallback that avoids deadlock when the estimated safe set is empty. We evaluate DiStaK-C51 on standard two-vehicle merge and roundabout benchmarks, where it achieves substantially improved safety metrics while maintaining strong task performance and stable learning dynamics. We also provide a theoretical analysis showing that the (fallback-augmented) screened safe Stackelberg Bellman operator is a contraction and that Top-K shortlisting and distributional projection yield an explicit ε-neighborhood bound. Finally, we outline a practical extension to multi-vehicle traffic via rule-based role assignment and a horizontal two-level Stackelberg expansion; comprehensive multi-vehicle evaluation is left for future work.
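The selection pipeline described above (CDF thresholding, retriever–refiner shortlisting, and the follower's penalized best response) can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names, the fallback rule (keep the candidate with the largest CDF mass), and the projected dual-ascent step on λ₂ are assumptions chosen for illustration, and the categorical cost distribution stands in for the output of a C51-style cost head.

```python
from typing import List, Sequence

def cdf_at(probs: Sequence[float], atoms: Sequence[float], limit: float) -> float:
    """P(C <= limit) for a categorical cost distribution on fixed atoms."""
    return sum(p for p, z in zip(probs, atoms) if z <= limit)

def expected_cost(probs: Sequence[float], atoms: Sequence[float]) -> float:
    """Mean of the categorical cost distribution."""
    return sum(p * z for p, z in zip(probs, atoms))

def admissible_actions(cost_probs, atoms, limit: float, delta: float) -> List[int]:
    """Chance-constrained set: actions with P(C <= limit) >= 1 - delta."""
    return [a for a, probs in enumerate(cost_probs)
            if cdf_at(probs, atoms, limit) >= 1.0 - delta]

def select_action(retriever_scores, q_values, cost_probs, atoms,
                  limit: float, delta: float, k_prime: int, k: int) -> int:
    """Retriever -> screening -> Top-K shortlist -> critic refinement."""
    # 1. Lightweight retriever: keep the Top-K' actions by a cheap score.
    ranked = sorted(range(len(retriever_scores)),
                    key=lambda a: retriever_scores[a], reverse=True)
    candidates = ranked[:k_prime]
    # 2. Chance-constraint screening of the candidate list.
    safe = set(admissible_actions(cost_probs, atoms, limit, delta))
    screened = [a for a in candidates if a in safe]
    # 3. Relaxation fallback (assumed rule): if nothing passes, keep the
    #    candidate with the largest CDF mass so the agent never deadlocks.
    if not screened:
        screened = [max(candidates,
                        key=lambda a: cdf_at(cost_probs[a], atoms, limit))]
    # 4. Top-K shortlist, then pick the best action under the critic.
    shortlist = screened[:k]
    return max(shortlist, key=lambda a: q_values[a])

def follower_best_response(q2, cost_probs, atoms, lam2: float) -> int:
    """Risk-aware best response: argmax over a of Q2(a) - lam2 * E[C2(a)]."""
    return max(range(len(q2)),
               key=lambda a: q2[a] - lam2 * expected_cost(cost_probs[a], atoms))

def dual_update(lam2: float, observed_cost: float, budget: float, lr: float) -> float:
    """Projected dual ascent (assumed form): raise lam2 when cost exceeds budget."""
    return max(0.0, lam2 + lr * (observed_cost - budget))

# Toy example: 4 actions, 3 cost atoms (hypothetical numbers).
atoms = [0.0, 0.5, 1.0]
cost_probs = [
    [0.90, 0.10, 0.00],
    [0.50, 0.20, 0.30],
    [0.80, 0.15, 0.05],
    [0.20, 0.20, 0.60],
]
retriever_scores = [0.1, 0.9, 0.6, 0.8]
q_values = [1.0, 3.0, 2.0, 4.0]

if __name__ == "__main__":
    print(admissible_actions(cost_probs, atoms, limit=0.5, delta=0.1))
    print(select_action(retriever_scores, q_values, cost_probs, atoms,
                        limit=0.5, delta=0.1, k_prime=3, k=2))
    print(follower_best_response(q_values, cost_probs, atoms, lam2=5.0))
```

In this toy setting the highest-value action has most of its cost mass above the limit, so screening removes it before the critic is consulted; raising λ₂ has a similar effect on the follower's choice without a hard constraint.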
Qu et al. (Thu,) studied this question.