Deepfake content is created or altered artificially using AI techniques so that it appears genuine. This research addresses the evolving challenge of detecting deepfake audio, as recent advances in deepfake technology have made fabricated content increasingly difficult to distinguish from real content. Using machine learning and deep learning methods, with Mel-frequency cepstral coefficients (MFCCs) for audio feature extraction, we focus on the Fake-or-Real (FoR) dataset, a state-of-the-art benchmark dataset generated with a text-to-speech (TTS) model. The dataset is divided into sub-datasets according to audio length and bit rate. The results show that Convolutional Neural Network (CNN) models achieve the highest accuracy in identifying deepfake audio on the for-rerec and for-2-sec datasets, while the gradient boosting model performs well on the for-norm dataset. The CNN model also performs strongly on the for-original dataset, outperforming other state-of-the-art models. This study advances deepfake detection, particularly for audio manipulation, by demonstrating the effectiveness of CNN models in detecting fake content.
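To make the MFCC feature extraction step concrete, the following is a minimal, dependency-free sketch of the standard pipeline: framing, Hamming window, power spectrum, mel filterbank, log, and DCT-II. It is not the authors' implementation. The abstract does not state the frame length, hop size, number of mel filters, or number of coefficients, so the values used here (`frame_len=256`, `hop=128`, `n_filters=20`, `n_coeffs=13`) and all function names are illustrative assumptions. In practice a library routine such as `librosa.feature.mfcc` would normally be used instead of a hand-written DFT.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 of one frame."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append((re * re + im * im) / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(i * top / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * sample_rate / n_fft  # centre frequency of DFT bin k
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
        for c in range(n_coeffs)
    ]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per frame (parameters are assumptions)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        power = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Floor the energies so empty or silent bands do not produce log(0).
        log_energies = [math.log(max(e, 1e-10)) for e in energies]
        features.append(dct2(log_energies, n_coeffs))
    return features


# Usage on a synthetic 440 Hz tone (a stand-in for a real audio clip).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]
feats = mfcc(tone, sample_rate)
print(len(feats), len(feats[0]))  # 7 frames, 13 coefficients each
```

The resulting frame-by-coefficient matrix can be fed to a CNN as a 2-D input, or averaged over frames into a fixed-length vector for a classical model such as gradient boosting; which of these the study uses is not specified in the abstract.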
C et al. (Tue,) studied this question.