Abstract

Advanced Driver-Assistance Systems (ADAS) and other in-vehicle technologies often fail to capture critical indicators of unsafe driver behavior, such as fatigue, distraction, or emotional states. Conventional approaches rely on a single data modality (e.g., video, audio, or vehicle sensors), which limits their effectiveness in real-world conditions. Multimodal driver behavior recognition addresses these limitations by integrating complementary data sources to provide a holistic view of the driver’s state. This paper introduces a dependable, real-time multimodal driver behavior recognition system that combines RGB video, acoustic signals, and geometric keypoints to improve transportation safety. The proposed framework employs lightweight transformer models, efficient video token thinning, noise-aware audio processing, and reinforcement learning-based dynamic path selection to balance accuracy and latency. Evaluated on two standard datasets, the system achieves 99.92% accuracy while operating at 41 frames per second with a latency of only 24 milliseconds, demonstrating its suitability for enhancing driver assistance and autonomous driving systems.
M.A. Sayedelahl
Damanhour University
Ahmed M. Khalil
Mohamed S. Benlamine
Journal of Big Data
Sayedelahl et al. (Fri,) studied this question.
synapsesocial.com/papers/6a1bd1745783ba022b6fd129 — DOI: https://doi.org/10.1186/s40537-026-01463-z