What question did this study set out to answer?

The study aims to improve autonomous driving by addressing weaknesses in hazard perception and generalization with a new framework, CogniDrive.

April 25, 2026. Open Access.

Autonomous driving system based on dual process theory and deliberate practice theory

Key Points

Developed the CogniDrive framework, which combines dual-process theory and deliberate practice theory.
Implemented two decision-making modes: InstinctNav for rapid, intuitive actions and ReflectPlan for reflective reasoning.
Integrated a vision-language model for broader environmental understanding and multimodal self-reflection.
Reported state-of-the-art performance in hazard detection and in generalization to corner cases.
Proposed a new evaluation framework emphasizing safety, comfort, and energy efficiency, and reported improvements on these measures.

Abstract

Despite significant progress, autonomous driving is still not widely deployed in open, unconstrained environments. The main obstacles are deficiencies in hazard perception, few-shot generalization, corner-case generalization, and evaluation metrics, all of which raise reliability concerns. To address these challenges, we propose CogniDrive, a framework based on dual-process theory and deliberate practice theory that leverages the contextual reasoning of a large language model (LLM) to improve the robustness and generalization of driving systems. Inspired by dual-process theory, CogniDrive comprises two cognition modes: InstinctNav for rapid, intuitive decision-making and ReflectPlan for reflective reasoning. InstinctNav, enhanced by a thought model and experience embedding for the LLM, combines behavioral cloning and retrieval-augmented generation to improve few-shot learning efficiency, following deliberate practice theory. ReflectPlan processes and internalizes reward signals that a self-reflection mechanism produces and embeds as language tokens in the prompt, enabling continuous improvement and generalization. To detect hazards in corner cases that fall outside the limited training data distribution, a vision-language model is integrated to provide comprehensive environmental understanding through multimodal self-reflection. We further propose an evaluation framework that complements traditional metrics by emphasizing safety, comfort, and energy efficiency, and we demonstrate state-of-the-art performance through extensive open-loop and closed-loop experiments.
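
The two-mode idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how CogniDrive chooses between InstinctNav and ReflectPlan, so the similarity threshold used here as the switch is an assumption, as are all names (`Experience`, `retrieve`, `decide`, `reflect`), the toy two-dimensional embeddings, and the fallback action. The sketch only shows the general pattern of retrieving similar past experiences for a fast decision and deferring to a slower reflective step when nothing in memory matches well.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Experience:
    """Hypothetical memory entry: a scene embedding and the action taken."""
    embedding: Sequence[float]
    action: str


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(memory: List[Experience], query: Sequence[float], k: int = 2) -> List[Experience]:
    """Return the k stored experiences most similar to the query scene."""
    ranked = sorted(memory, key=lambda e: cosine(e.embedding, query), reverse=True)
    return ranked[:k]


def decide(
    memory: List[Experience],
    query: Sequence[float],
    threshold: float = 0.9,  # assumed switch criterion, not from the paper
    reflect: Optional[Callable[[Sequence[float], List[Experience]], str]] = None,
) -> Tuple[str, str]:
    """Return (mode, action): 'fast' reuses a close match, 'slow' reflects."""
    best = retrieve(memory, query, k=1)
    if best and cosine(best[0].embedding, query) >= threshold:
        # Fast path: a sufficiently similar experience exists, so reuse its action.
        return ("fast", best[0].action)
    # Slow path: hand the scene and retrieved context to a reflective step.
    # The default fallback action is a placeholder for that step.
    context = retrieve(memory, query)
    action = reflect(query, context) if reflect is not None else "slow_down"
    return ("slow", action)


memory = [
    Experience([1.0, 0.0], "keep_lane"),
    Experience([0.0, 1.0], "brake"),
]
familiar = decide(memory, [1.0, 0.05])   # ("fast", "keep_lane")
unfamiliar = decide(memory, [1.0, 1.0])  # ("slow", "slow_down")
```

In a system along the lines the abstract describes, the `reflect` callback would be where an LLM reasons over the retrieved experiences and feedback from self-reflection; here it is left as an optional hook so the sketch stays runnable on its own.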
