What question did this study set out to answer?

This research aims to create a real-time framework for recognizing driver behavior using multiple data sources to enhance safety in driving systems.

May 31, 2026 · Open Access

A real-time multimodal framework for driver behavior recognition using transformer and reinforcement learning

Key Points

Developed a multimodal system integrating RGB video, acoustic signals, and geometric keypoints.
Employed lightweight transformer models and reinforcement learning for dynamic path selection.
Evaluated system performance on two standard datasets with real-time operation metrics.
Achieved 99.92% accuracy in driver behavior recognition.
Operated at 41 frames per second with a latency of 24 milliseconds.
Demonstrated effectiveness in enhancing driver assistance systems.
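
The two real-time figures above can be checked against each other with simple arithmetic. The sketch below assumes the reported 24 ms latency is the processing time for a single frame (the key points do not say whether it is per-frame or end-to-end), and the function names are illustrative, not from the paper.

```python
def frame_period_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fits_frame_budget(latency_ms: float, fps: float) -> bool:
    """True when one frame's processing latency fits inside the frame period."""
    return latency_ms <= frame_period_ms(fps)


# Figures reported in the key points: 41 frames per second, 24 ms latency.
period = frame_period_ms(41)
print(f"frame period: {period:.2f} ms")
print("24 ms fits the budget:", fits_frame_budget(24, 41))
```

At 41 frames per second each frame has roughly 24.4 ms available, so under the per-frame assumption a 24 ms latency is consistent with the reported throughput.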

Abstract

Advanced Driver-Assistance Systems (ADAS) and other in-vehicle technologies often fail to capture critical indicators of unsafe driver behavior, such as fatigue, distraction, or emotional states. Conventional approaches rely on a single data modality (e.g., video, audio, or vehicle sensors), which limits their effectiveness in real-world conditions. Multimodal driver behavior recognition addresses these limitations by integrating complementary data sources to provide a holistic view of the driver’s state. This paper introduces a dependable, real-time multimodal driver behavior recognition system that combines RGB video, acoustic signals, and geometric keypoints to improve transportation safety. The proposed framework employs lightweight transformer models, efficient video token thinning, noise-aware audio processing, and reinforcement learning-based dynamic path selection to balance accuracy and latency. Evaluated on two standard datasets, the system achieves 99.92% accuracy while operating at 41 frames per second with a latency of only 24 milliseconds, demonstrating its suitability for enhancing driver assistance and autonomous driving systems.
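
The abstract does not describe how the reinforcement learning-based path selection works, so the following is only a minimal sketch of the general idea of trading accuracy against latency. It substitutes a simple epsilon-greedy bandit for whatever method the authors use. The path names, accuracies, latencies, and the latency weight are all hypothetical placeholders, not values from the paper.

```python
import random

# Hypothetical processing paths: (name, probability of a correct prediction, latency in ms).
# These numbers are illustrative placeholders, not values from the paper.
PATHS = [("light", 0.80, 10.0), ("medium", 0.97, 24.0), ("heavy", 0.99, 60.0)]
LATENCY_WEIGHT = 0.004  # assumed penalty per millisecond of latency


def reward(correct: bool, latency_ms: float) -> float:
    """Reward a correct prediction and penalize slow paths."""
    return (1.0 if correct else 0.0) - LATENCY_WEIGHT * latency_ms


def train_selector(steps: int = 5000, epsilon: float = 0.1, seed: int = 0) -> list[float]:
    """Epsilon-greedy bandit: returns the learned mean reward of each path."""
    rng = random.Random(seed)
    values = [0.0] * len(PATHS)
    counts = [0] * len(PATHS)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(PATHS))  # explore a random path
        else:
            arm = max(range(len(PATHS)), key=values.__getitem__)  # exploit best estimate
        _, accuracy, latency_ms = PATHS[arm]
        r = reward(rng.random() < accuracy, latency_ms)  # simulated outcome
        counts[arm] += 1
        values[arm] += (r - values[arm]) / counts[arm]  # incremental mean update
    return values


values = train_selector()
best = max(range(len(PATHS)), key=values.__getitem__)
print("preferred path:", PATHS[best][0])
```

With these placeholder numbers the expected rewards are 0.76 for the light path, about 0.874 for the medium path, and 0.75 for the heavy path, so a selector of this kind should come to prefer the medium path. A real system would condition the choice on the current input rather than learning one fixed preference.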

Authors

M.A. Sayedelahl

Damanhour University

Ahmed M. Khalil

Mohamed S. Benlamine

Journals

Journal of Big Data
