Robust autonomous driving under adversarial disturbances remains a fundamental challenge, as conventional deep reinforcement learning (DRL) frameworks exhibit fragile convergence and limited resilience to distributional shifts. This study presents the Adversarially Adaptive Reward-Enhanced Soft Actor-Critic (AARE-SAC), which reconceptualizes SAC as a hierarchically self-regulating control system. Within this paradigm, the temporal-difference error functions as a meta-feedback variable that adaptively modulates learning dynamics, reward valuation, and adversarial response, forming a self-stabilizing loop that internalizes robustness. Experiments in diverse CARLA urban environments demonstrate that AARE-SAC achieves faster and smoother convergence with markedly improved stability and safety. Ablation analyses confirm that its advantage arises from the synergistic coupling between adaptive regulation and adversarial exposure, transforming robustness from a reactive safeguard into an intrinsic property of learning. These findings establish AARE-SAC as a unified framework for achieving stability, efficiency, and resilience in safety-critical autonomous driving.
Yuan et al. (Sun,) studied this question.