Classifying dangerous driving behaviors has become an increasingly important task in the development of intelligent transportation systems and autonomous driving technologies. With the growing prevalence of accidents caused by distracted, aggressive, or fatigued driving, there is a pressing need for accurate, interpretable, and real-time behavior recognition models. In this paper, we propose a novel neural network-based framework that combines a Behavioral Latent Transformer Network (BLTN) with a Causal Risk-Weighted Propagation (CRWP) mechanism. BLTN models driver behavior as a sequence of latent states inferred from multimodal sensor inputs (such as vehicle dynamics, visual context, and environmental data), capturing both spatial correlations and temporal dynamics through variational inference and attention mechanisms. On top of this, CRWP builds a dynamic causal graph over behavior sequences, enabling risk signals to be propagated and aggregated across time so that critical behavior transitions and high-risk patterns can be identified. This layered architecture improves not only detection accuracy but also interpretability and generalization across diverse driving scenarios. Experimental evaluations on four widely used datasets (Drive and Act, SHRP 2 NDS, Driver Attention Dataset, and Brain4Cars) show substantial performance gains over leading baselines in classification accuracy, recall, F1-score, and AUC. Ablation studies further confirm the contribution of individual model components, such as the modality-aware fusion and anticipation modules. Although sensor noise and computational complexity still pose challenges for real-time deployment, our framework establishes a scalable and extensible foundation for future intelligent driver monitoring and behavior prediction systems.
Cui et al. studied this question.