What question did this study set out to answer?

The study aims to enhance the quality and effectiveness of adversarial deepfake generation to bypass forensic detection.

May 18, 2026

DCHVF-GAN: Synthesizing Adversarial DeepFakes with High Visual Fidelity by Multimodality Fusion

Key Points

The study aims to enhance the quality and effectiveness of adversarial deepfake generation to bypass forensic detection.
Proposed a spectral fusion approach for synthesizing forgery traces from authentic facial images.
Integrated diffusion-based noise during image preprocessing to embed perturbations.
Conducted extensive experiments to evaluate the anti-forensic performance.
Achieved state-of-the-art anti-forensic performance with preserved high visual fidelity.
Generated adversarial samples remained indistinguishable from real images.

Abstract

Deepfake, an AI-driven face-swapping technique, has been weaponized to spread disinformation. In response, researchers have developed forensic detectors to identify such manipulations. To circumvent these defenses, a growing body of work now focuses on generating adversarial samples—carefully perturbed forgeries designed to deceive detection tools. However, most existing adversarial generation methods sacrifice image quality to achieve undetectability, introducing perceptible artifacts that ironically make them more detectable under human scrutiny. To address this limitation, we propose a novel spectral fusion approach to multimodally synthesize forgery traces from authentic facial images. Unlike traditional noise injection methods, our technique integrates diffusion-based noise during image preprocessing, embedding perturbations in the forward process of a diffusion model. This approach not only deceives forensic detectors more effectively but also preserves high visual fidelity. Through extensive experiments, our method achieves state-of-the-art DeepFake anti-forensic performance while preserving high visual fidelity, ensuring that the adversarial samples remain indistinguishable from real images.
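The abstract says perturbations are embedded in the forward process of a diffusion model. As a rough illustration only, the sketch below shows the standard DDPM-style forward noising step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. It is not the authors' implementation: the linear beta schedule, the step count, the image shape, and the function names `linear_beta_schedule` and `forward_diffuse` are all assumptions made for this example, and the paper's spectral fusion step is not modeled here.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Hypothetical linear noise schedule (an assumption, not from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)


def forward_diffuse(x0, t, betas, rng):
    """Sample x_t from q(x_t | x_0) in closed form (standard DDPM forward step).

    x0    : image array scaled to [-1, 1]
    t     : integer timestep index into the schedule
    betas : 1-D array of per-step noise variances
    rng   : numpy random Generator used to draw the Gaussian noise
    """
    # Cumulative product of (1 - beta) gives the signal retention at each step.
    alpha_bar = np.cumprod(1.0 - betas)
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


rng = np.random.default_rng(0)
# Stand-in for a face image: 3 channels, 64x64 pixels, values in [-1, 1].
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))
betas = linear_beta_schedule(1000)

# A small timestep adds a light perturbation; a large one approaches pure noise.
x_early, _ = forward_diffuse(x0, 10, betas, rng)
x_late, _ = forward_diffuse(x0, 999, betas, rng)

early_change = float(np.abs(x_early - x0).mean())
late_change = float(np.abs(x_late - x0).mean())
```

In this sketch, a small `t` keeps the noised image close to the original, which is the regime where a perturbation could plausibly stay imperceptible. How the paper chooses its timestep and combines the noise with forged content is not stated in the abstract.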

Cite This Study

Ding et al. studied this question.

synapsesocial.com/papers/6a0aad145ba8ef6d83b709e9 https://doi.org/10.1145/3808697
