What question did this study set out to answer?

The aim is to develop a multimodal framework for deepfake detection that can operate effectively on resource-constrained edge devices.

June 12, 2026 | Open Access

Sensors-Driven Multimodal Deepfake Detection: A Cross-Attention Fusion Approach with Adaptive Modality Gating

Key Points

Aims to develop a multimodal deepfake detection framework that runs effectively on resource-constrained edge devices.
Proposes a cross-modal attention fusion mechanism with adaptive gating.
Uses an enhanced Res2Net for audio and a temporal 3D CNN with SE attention for video.
Evaluated on a dataset of 7314 samples (5472 audio + 1842 video).
The fusion model achieved 96.7% accuracy and a 96.6% F1-score.
Accuracy under the FGSM attack was 92.3%.
Cross-dataset evaluation on FakeAVCeleb yielded 92.3% overall accuracy.
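
For readers unfamiliar with the FGSM attack named in the key points, the following is a minimal sketch of the general technique, not the study's evaluation code. It assumes a toy logistic-regression classifier in place of the actual detector; the names `fgsm`, `weights`, `bias`, and `epsilon` are illustrative assumptions. FGSM perturbs each input feature by a fixed step in the direction that increases the loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)


def predict(x, weights, bias):
    """Probability of the positive class under a toy logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fgsm(x, y, weights, bias, epsilon):
    """Return x + epsilon * sign(dLoss/dx) for binary cross-entropy loss."""
    p = predict(x, weights, bias)
    # For a logistic model with cross-entropy loss, dLoss/dx_i = (p - y) * w_i.
    grad = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input, chosen only for illustration.
weights, bias = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = fgsm(x, y, weights, bias, epsilon=0.25)
```

In this toy setting the perturbed input lowers the model's confidence in the true label, which is the effect an adversarial robustness test measures.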

Abstract

Deepfakes threaten sensor-based authentication systems, including biometric sensors, surveillance cameras, and IoT edge devices. Unimodal detectors remain vulnerable to modality-specific attacks. We propose a multimodal deepfake detection framework optimized for resource-constrained edge devices, featuring a novel cross-modal attention fusion mechanism with adaptive gating. The architecture combines enhanced Res2Net for audio, temporal 3D CNN with SE attention for video, and bidirectional cross-modal attention with quality-based gates. On our benchmark (5472 audio + 1842 video samples), the fusion model achieves 96.7% accuracy, 96.6% F1-score, 0.988 AUC-ROC, and 3.3% EER. Adversarial testing shows 92.3% accuracy under the Fast Gradient Sign Method (FGSM) attack. The model has a 30.3 MB footprint and runs at 20 FPS on edge hardware. Modality contribution analysis reveals adaptive weighting (72% audio for TTS forgery, 78% video for lip-synced attacks). Cross-dataset evaluation on FakeAVCeleb achieves 92.3% overall accuracy, confirming generalization.
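
The idea of quality-based gating described in the abstract can be illustrated with a heavily simplified sketch. This is an assumption-laden illustration rather than the authors' architecture: the paper applies gates inside a bidirectional cross-modal attention module, whereas the hypothetical `gated_fusion` function below only weights two per-modality fake probabilities by a softmax over assumed scalar quality scores.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(audio_prob, video_prob, audio_quality, video_quality):
    """Fuse per-modality fake probabilities using quality-derived gate weights.

    The quality scores are assumed inputs (for example, a signal-quality
    estimate per modality); higher quality yields a larger gate weight.
    """
    w_audio, w_video = softmax([audio_quality, video_quality])
    fused = w_audio * audio_prob + w_video * video_prob
    return fused, (w_audio, w_video)


# Hypothetical example: clean audio, degraded video.
fused, gate_weights = gated_fusion(
    audio_prob=0.9, video_prob=0.4, audio_quality=2.0, video_quality=0.5
)
```

With equal quality scores the gate reduces to a plain average; as one modality's quality drops, its weight shrinks, which mirrors the modality-dependent weighting the abstract reports.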

Cite This Study

Waseem et al. (Wed,) studied this question.

synapsesocial.com/papers/6a2ba2f68101cf8926f01c11
https://doi.org/10.3390/s26123695
