October 31, 2021

Deepfake Detection Based on Incompatibility Between Multiple Modes

Abstract

We propose a multi-modal detection method for deepfake videos, called Incompatibility Between Multiple Modes (IBMM) detection. The algorithm determines whether a video is real or fake, and may be embedded in monitoring equipment in the future. The model combines EfficientNet with a simple 3D-CNN and identifies deepfake videos through three modes. In the facial motion mode and the lip motion mode, we use EfficientNet for feature learning. This network uses a set of fixed scaling coefficients to scale the dimensions of the network uniformly, and it achieves good results in learning image features. In the audio mode, we adopt a 3D-CNN to train on the hot coding diagram of the audio data. Within a single mode, we use the cross-entropy loss to measure the irrationality of that mode. Across modes, we use the contrastive loss to measure the incongruity between them, such as the incompatibility between lips and voice. Experimental results show that the proposed method achieves higher accuracy (95.87%) on the DFDC dataset than existing fake detection methods, an increase of 5.21%.
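The abstract names two loss terms but not how they are combined, so the following is only a minimal sketch of one plausible reading: a per-mode binary cross-entropy plus a standard margin-based contrastive loss over every pair of modes. The function names, the `margin` and `weight` parameters, the label convention (1 = fake, 0 = real), and the choice to sum over all mode pairs are assumptions made for illustration, not details taken from the paper.

```python
import math

def cross_entropy(p_fake, label):
    """Binary cross-entropy for one mode; label is 1 for fake, 0 for real."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p_fake, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, label, margin=1.0):
    """Standard contrastive loss between the embeddings of two modes.

    Assumed convention: for a real video (label 0) the modes should agree,
    so distance is penalized; for a fake video (label 1) they should
    disagree, so distances below the margin are penalized.
    """
    d = euclidean(emb_a, emb_b)
    if label == 0:
        return d ** 2
    return max(0.0, margin - d) ** 2

def combined_loss(probs, embeddings, label, weight=1.0, margin=1.0):
    """Sum of per-mode cross-entropy and pairwise contrastive terms.

    probs:      dict mapping mode name to predicted probability of "fake"
    embeddings: dict mapping mode name to an embedding vector
    weight:     hypothetical balance factor between the two terms
    """
    intra = sum(cross_entropy(p, label) for p in probs.values())
    names = sorted(embeddings)
    inter = sum(
        contrastive_loss(embeddings[a], embeddings[b], label, margin)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
    return intra + weight * inter
```

With three modes (for example `"face"`, `"lips"`, and `"audio"` as dictionary keys), `combined_loss` evaluates three cross-entropy terms and three pairwise contrastive terms. The lips and audio pair corresponds to the incompatibility between lips and voice mentioned in the abstract.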

Cite This Study

Zhang et al. (2021) studied this question.

synapsesocial.com/papers/6a151425a05db7ab4b62e140 https://doi.org/10.1109/icites53477.2021.9637096
