March 3, 2026 · Open Access

LLM-guided deep reinforcement learning with contrastive safety regularization for autonomous driving

Key Points

LSA-DQN reduces the collision rate to 0.9%, improving safety performance over baseline algorithms.
Experiments reveal that LSA-DQN maintains high traffic efficiency even with enhanced safety measures.
A multi-head attention network improves the agent's comprehension of complex traffic scenarios.
The framework implements a hybrid safety regularization method that combines Conservative Q-Learning with Margin-based Contrastive Penalization.

Abstract

Deep reinforcement learning (DRL) has shown considerable potential for autonomous driving decision-making; however, its application in safety-critical scenarios such as highways still faces significant challenges in data efficiency and safety performance. To address this, this paper proposes an LLM-guided Safety-Aware DQN (LSA-DQN) framework. The framework first uses a multi-head attention network to enhance comprehension of complex traffic scenarios. Building on this, we design a method that combines physics-based pre-screening with Large Language Model (LLM) arbitration to classify experience data accurately. Furthermore, we devise a hybrid safety regularization method that integrates Conservative Q-Learning (CQL) with Margin-based Contrastive Penalization (MCP) to learn explicit safety boundaries. Experimental results show that, compared to baseline algorithms, LSA-DQN reduces the collision rate to 0.9% while maintaining high traffic efficiency, indicating robustness and reliability in complex and dynamic highway environments.
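The abstract does not give the form of the hybrid regularizer, so the following is only a minimal sketch of how a TD loss, a CQL term, and a margin-based penalty on unsafe actions might be combined. Everything here is an assumption made for illustration: the function name `hybrid_safety_loss`, the weights `alpha` and `beta`, the `margin` value, the hinge form of the penalty, and the use of the best alternative action as the contrast. It should not be read as the paper's actual loss.

```python
import numpy as np


def logsumexp(q, axis=-1):
    """Numerically stable log-sum-exp over the action axis."""
    m = q.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(q - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def hybrid_safety_loss(q_values, actions, td_targets, unsafe_mask,
                       alpha=1.0, beta=1.0, margin=0.5):
    """Illustrative hybrid loss (hypothetical, not the paper's formulation).

    q_values    : (B, A) Q-values for every action in each sampled state
    actions     : (B,)   integer index of the action taken
    td_targets  : (B,)   bootstrapped TD targets
    unsafe_mask : (B,)   True where the taken action was labeled unsafe
    """
    q_values = np.asarray(q_values, dtype=float)
    actions = np.asarray(actions, dtype=int)
    td_targets = np.asarray(td_targets, dtype=float)
    unsafe = np.asarray(unsafe_mask, dtype=float)

    rows = np.arange(q_values.shape[0])
    q_taken = q_values[rows, actions]

    # Standard DQN term: squared TD error on the taken action.
    td_loss = np.mean((q_taken - td_targets) ** 2)

    # CQL term: push down Q-values overall, push up the dataset action.
    cql_loss = np.mean(logsumexp(q_values, axis=1) - q_taken)

    # Assumed margin penalty: for unsafe samples, the unsafe action's Q-value
    # should sit at least `margin` below the best alternative action.
    masked = q_values.copy()
    masked[rows, actions] = -np.inf
    q_alt = masked.max(axis=1)
    hinge = np.maximum(0.0, q_taken - q_alt + margin)
    mcp_loss = np.sum(hinge * unsafe) / max(unsafe.sum(), 1.0)

    total = td_loss + alpha * cql_loss + beta * mcp_loss
    return {"total": total, "td": td_loss, "cql": cql_loss, "mcp": mcp_loss}


if __name__ == "__main__":
    q = [[1.0, 2.0, 0.0], [3.0, 0.0, 0.0]]
    out = hybrid_safety_loss(q, actions=[1, 0], td_targets=[2.0, 3.0],
                             unsafe_mask=[False, True])
    print(out)
```

In this sketch the margin term is active only for samples flagged unsafe, which in the framework described above would come from the physics-based pre-screening and LLM arbitration step; how those labels are actually produced and weighted is not specified in the abstract.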
