August 6, 2024 · Open Access

Leveraging Deep Learning Architectures for Deepfake Audio Analysis

Abstract

Deepfake content is created or altered artificially using AI techniques so that it appears genuine. This research addresses the evolving challenge of detecting deepfake audio, as recent advances in deepfake technology have made fabricated content increasingly difficult to distinguish from real recordings. Using machine learning and deep learning methods, with Mel-frequency cepstral coefficients (MFCCs) for audio feature extraction, we focus on the Fake-or-Real (FoR) dataset, a recent benchmark dataset generated with a text-to-speech (TTS) model. The dataset is divided into sub-datasets according to audio length and bit rate. The study finds that Convolutional Neural Network (CNN) models achieve the highest accuracy in identifying deepfake audio on the for-rerec and for-2-sec datasets, while the gradient boosting model performs well on the for-norm dataset. The CNN model also performs strongly on the for-original dataset, outperforming other state-of-the-art models. This study advances the field of deepfake detection, particularly for audio manipulation, and demonstrates the efficacy of CNN models in detecting fake content.
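The abstract names MFCCs as the feature representation but gives no extraction settings. As a rough illustration of what such a step computes, the following is a minimal NumPy-only sketch of a textbook MFCC pipeline (pre-emphasis, framing, power spectrum, mel filterbank, log, DCT-II). The function names and every parameter value (16 kHz sample rate, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients, 0.97 pre-emphasis) are assumptions chosen for illustration, not the configuration used in the study; in practice an audio library would likely be used instead.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs). All defaults are assumed values."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before spectral analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft points).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filterbank energies, log compression, then DCT to decorrelate.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_coeffs)

# Usage on a synthetic one-second 440 Hz tone (a stand-in for a real audio clip).
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2 * np.pi * 440.0 * t))
print(features.shape)  # (98, 13)
```

The resulting frame-by-coefficient matrix is the kind of input that could be fed to a classifier such as a CNN or a gradient boosting model; how the study shapes or aggregates these features is not stated in the abstract.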

Cite This Study

C et al. (Tue,) studied this question.

synapsesocial.com/papers/68e5d47cb6db64358756acc8 https://doi.org/10.29007/sl8m
