What does this research mean for the field?

The proposed delay-aware cross-modal knowledge distillation method improves driver vigilance estimation accuracy and temporal alignment over existing methods while keeping the real-time performance needed for edge deployment.

What question did this study set out to answer?

The aim is to create an efficient method for estimating driver vigilance using multiple physiological signals while considering practical deployment challenges.

March 5, 2026

Delay-Aware Cross-Modal Knowledge Distillation for Driver Vigilance Estimation: Toward Practical Edge Deployment

Key Points

The aim is an efficient method for estimating driver vigilance from multiple physiological signals that remains practical to deploy.
EEG signals are used only to train the teacher model for vigilance estimation.
An information-theoretic criterion based on mutual information and response delay selects which physiological signals serve as the student modality.
A delay-aware soft alignment mechanism (DASA) handles temporal misalignment between EEG and the other physiological signals.
The objective function combines cross-modal consistency, patch-level alignment, and smoothness regularization.
On the MMV and SEED-VIG datasets, the method outperforms existing methods in estimation accuracy.
It also improves temporal alignment between the signals.
It maintains the real-time performance required for edge deployment.

Abstract

Efficient vigilance estimation in driving scenarios requires a balance between model performance and practicality. Electroencephalography (EEG) directly reflects brain activity and is widely used for vigilance estimation, but its acquisition process is complicated and difficult to apply to real-world driving. In contrast, physiological signals such as electrooculogram, electrodermal activity, and photoplethysmography are easier to deploy in practice, but the information they provide is relatively limited. To address these issues, we propose a delay-aware cross-modal knowledge distillation method. EEG signals are used only to train the teacher model. An information-theoretic criterion based on mutual information and response delay then determines which physiological signals are suitable as the student modality for knowledge distillation from the EEG-based teacher model. Because physiological signals differ in their sensitivity to cognitive responses, they are inherently offset in time relative to one another. To handle this temporal misalignment, we propose a delay-aware soft alignment mechanism (DASA). DASA introduces learnable delay and spread parameters at the patch level to capture the asynchronous dynamics between EEG and the other physiological signals, which yields soft, temporally aligned supervision from the teacher to the student model. Finally, an objective function incorporating cross-modal consistency, patch-level alignment, and smoothness regularization is designed to support effective training of the proposed method. Extensive experiments on the MMV and SEED-VIG datasets validate that the proposed method outperforms existing methods in estimation accuracy and temporal alignment while maintaining the real-time performance required for edge deployment.
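The idea of soft alignment with a delay and a spread can be illustrated with a minimal sketch. The abstract does not give DASA's actual formulation, so everything below is an assumption made for illustration: the Gaussian-shaped weighting, the function names (soft_alignment_weights, align_teacher, patch_alignment_loss), and the use of fixed numbers for delay and spread, which in the paper are learnable parameters. The sketch assumes the student signal lags the teacher by roughly delay patches, so student patch j is supervised by a weighted blend of teacher patches centered at j - delay.

```python
import math


def soft_alignment_weights(num_patches, delay, spread):
    """Return one row of weights per student patch.

    Row j is a softmax over teacher patch indices i, peaked at i = j - delay.
    The Gaussian-shaped kernel is an illustrative assumption, not the paper's
    stated formulation.
    """
    weights = []
    for j in range(num_patches):
        center = j - delay  # teacher patch that student patch j is assumed to echo
        logits = [-((i - center) ** 2) / (2.0 * spread ** 2)
                  for i in range(num_patches)]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def align_teacher(teacher_patches, delay, spread):
    """Blend teacher patch features into one soft target per student patch."""
    weights = soft_alignment_weights(len(teacher_patches), delay, spread)
    dim = len(teacher_patches[0])
    return [
        [sum(w * patch[k] for w, patch in zip(row, teacher_patches))
         for k in range(dim)]
        for row in weights
    ]


def patch_alignment_loss(student_patches, teacher_patches, delay, spread):
    """Mean squared error between student patches and delay-shifted targets."""
    targets = align_teacher(teacher_patches, delay, spread)
    squared = [(s - t) ** 2
               for sp, tp in zip(student_patches, targets)
               for s, t in zip(sp, tp)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Toy data: the student is the teacher delayed by two patches.
    teacher = [[float(i)] for i in range(8)]
    student = [teacher[max(j - 2, 0)] for j in range(8)]
    print(patch_alignment_loss(student, teacher, delay=0.0, spread=0.3))
    print(patch_alignment_loss(student, teacher, delay=2.0, spread=0.3))
```

On this toy data the loss is lower when the delay matches the true two-patch lag than when no delay is assumed, which is the signal a learnable delay parameter could follow during training. A larger spread would blend more neighboring teacher patches into each target, trading sharpness for tolerance to timing jitter.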
