What question did this study set out to answer?

The aim is to develop a robust model for detecting deepfakes using advanced neural network techniques while reducing computational demands.

February 28, 2026 | Open Access

Hybrid deep feature integration model for robust deepfake detection using transfer-learned neural networks

Key Points

Analyzed state-of-the-art deepfake detection methods, including Convolutional Neural Networks, Xception, and ResNet50.
Proposed a hybrid approach (DAAL-NET) featuring a Local Forensics Encoder and Motion Irregularity Encoder.
Implemented a Multi-Stream Interaction Module for spatial-temporal data fusion.
Developed an Artifact Confidence Calibration Layer to enhance detection reliability.
Demonstrated improvement in macro-F1 score and calibration error compared to baseline models.
Showed enhanced temporal robustness for deepfake detection.
Achieved competitive performance under limited computational resources suitable for real-world applications.
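
The two headline metrics above, macro-F1 and calibration error, can be illustrated with a minimal sketch. The source does not say which calibration metric the authors report; the binned expected calibration error (ECE) below is an assumption, and the function names and the 10-bin default are illustrative choices rather than details from the paper.

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 scores (here 0 = real, 1 = fake)."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def expected_calibration_error(y_true, prob_fake, n_bins=10):
    """Binned ECE for a binary detector (assumed metric, not confirmed by the source).

    Confidence is the probability of the predicted class; each bin contributes
    |accuracy - mean confidence| weighted by its share of the samples.
    """
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, prob_fake):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == t else 0.0))
    n = len(y_true)
    ece = 0.0
    for b in bins:
        if b:
            mean_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(a for _, a in b) / len(b)
            ece += (len(b) / n) * abs(accuracy - mean_conf)
    return ece
```

Macro-F1 weights the real and fake classes equally, which matters when a dataset contains far more fake clips than real ones. A lower ECE means the predicted probabilities track the observed accuracy more closely.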

Abstract

Introduction: With the rapid development of artificial intelligence, the creation of realistic deepfake multimedia content has become widely accessible, placing substantial demands on digital security and media authenticity.

Methods: Prevailing methods rely heavily on deep learning and transformer-driven architectures, but their computational cost, resource usage, and sensitivity to dataset bias hinder real-world deployment. This work studies several approaches for detecting deepfake content in images and videos, analyzes state-of-the-art techniques (Convolutional Neural Network, Xception, ResNet50), and proposes a hybrid approach (DAAL-NET): a lightweight, bi-stream, artifact-resistant deepfake detector that simultaneously learns spatial cues and temporal motion inconsistencies. The framework combines three novelties: (1) a Local Forensics Encoder with a Learnable Frequency Attention mechanism to analyze high-frequency manipulation artifacts; (2) a Motion Irregularity Encoder with depthwise temporal convolutions and gated recurrent units to capture frame-level motion gaps; and (3) a Multi-Stream Interaction Module for bidirectional spatial-temporal fusion using cross-attention. A trained Artifact Confidence Calibration Layer is proposed to improve the reliability of the predicted probabilities.

Results and discussion: Experiments on the Celeb-DF (v2) and Kaggle datasets show that the proposed hybrid approach improves macro-F1, calibration error, and temporal robustness compared to baseline models. The proposed model achieves competitive results under constrained computational resources, making it suitable for forensic applications, real-world media authentication systems, low-power deployments, and scalable deepfake screening pipelines.
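
The bidirectional cross-attention fusion named in the abstract can be sketched in a few lines. This is not the authors' Multi-Stream Interaction Module: it omits learned query/key/value projections, multiple heads, and normalization, and the function names, residual connections, and mean pooling are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product attention: each query token reads from `context`.

    Assumption: no learned projections, so keys and values are the raw tokens.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        )
    return attended


def mean_pool(tokens):
    """Average a list of token vectors into a single vector."""
    return [sum(column) / len(tokens) for column in zip(*tokens)]


def bidirectional_fusion(spatial, temporal):
    """Fuse two token streams with cross-attention in both directions.

    Spatial tokens query the temporal stream and vice versa; each stream keeps
    a residual copy of itself, is mean-pooled, and the two results are
    concatenated (hypothetical design choices, not taken from the paper).
    """
    spatial_reads_temporal = cross_attend(spatial, temporal)
    temporal_reads_spatial = cross_attend(temporal, spatial)
    spatial_out = [
        [a + b for a, b in zip(tok, att)]
        for tok, att in zip(spatial, spatial_reads_temporal)
    ]
    temporal_out = [
        [a + b for a, b in zip(tok, att)]
        for tok, att in zip(temporal, temporal_reads_spatial)
    ]
    return mean_pool(spatial_out) + mean_pool(temporal_out)
```

For example, `bidirectional_fusion(spatial_tokens, temporal_tokens)` with token vectors of dimension `d` returns one fused vector of length `2 * d`, which a classifier head could then score as real or fake.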

Potluri et al. studied this question.

synapsesocial.com/papers/69a285aa0a974eb0d3c00b53 https://doi.org/10.3389/frai.2026.1737761
