Introduction: With the rapid development of artificial intelligence and intelligent learning, realistic deepfake multimedia content has become easy to create, placing substantial demands on digital security and media authenticity.

Methods: Although prevailing methods rely heavily on deep learning and transformer-driven techniques, their computational cost, resource usage, and sensitivity to dataset bias hinder real-world deployment. This work studies several approaches for detecting deepfake content in images and videos, analyzes state-of-the-art techniques (Convolutional Neural Network, Xception, ResNet50), and proposes a hybrid approach (DAAL-NET): a lightweight, bi-stream, artifact-resistant deepfake detector that simultaneously learns spatial patterns and cues and temporal motion inconsistencies. The framework combines three significant novelties: (1) a Local Forensics Encoder with a Learnable Frequency Attention mechanism to analyze high-frequency manipulation; (2) a Motion Irregularity Encoder with depthwise temporal convolutions and gated recurrent units to capture frame-level motion gaps; and (3) a Multi-Stream Interaction Module for bidirectional spatial-temporal fusion using cross-attention. An Artifact Confidence Calibration Layer is trained to improve the reliability of the predicted probabilities.

Results and discussion: Experiments on the Celeb-DF (v2) and Kaggle datasets show that the proposed hybrid approach improves macro-F1, calibration error, and temporal robustness compared with baseline models. The proposed model achieves competitive results under constrained computational resources, making it suitable for forensic applications, real-world media authentication systems, low-power deployments, and scalable deepfake screening pipelines.
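The abstract reports macro-F1 and calibration error without defining them. As a minimal sketch of how such metrics are commonly computed for a binary real/fake detector, the following assumes that "calibration error" means expected calibration error (ECE) with equal-width confidence bins; the abstract does not confirm this choice. The function names `macro_f1` and `expected_calibration_error` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def expected_calibration_error(
    y_true: Sequence[int], p_fake: Sequence[float], n_bins: int = 10
) -> float:
    """Binary ECE (an assumed definition, not confirmed by the source).

    p_fake[i] is the predicted probability that sample i is fake (label 1).
    Confidence is the probability of the predicted class; samples are grouped
    into equal-width confidence bins and the gap between accuracy and mean
    confidence is averaged, weighted by bin size.
    """
    n = len(y_true)
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    correct = [0] * n_bins
    for t, p in zip(y_true, p_fake):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        counts[idx] += 1
        conf_sums[idx] += conf
        correct[idx] += int(pred == t)
    ece = 0.0
    for k in range(n_bins):
        if counts[k]:
            acc = correct[k] / counts[k]
            avg_conf = conf_sums[k] / counts[k]
            ece += (counts[k] / n) * abs(acc - avg_conf)
    return ece
```

A lower ECE means the predicted probabilities track the observed accuracy more closely, which is the property a calibration layer is meant to improve; macro-F1 weights the real and fake classes equally regardless of class imbalance.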
Potluri et al. (Wed,) studied this question.