The proliferation of fake news, particularly in sensitive domains like religious texts, necessitates robust authenticity verification methods. This study addresses the growing challenge of authenticating Hadith, where traditional methods relying on the analysis of the chain of narrators (Isnad) and the content (Matn) are increasingly strained by the sheer volume in circulation. To combat this issue, machine learning (ML) and natural language processing (NLP) techniques, specifically through transfer learning, are explored to automate Hadith classification into Genuine and Fake categories. This study utilizes an imbalanced dataset of 8544 Hadiths, with 7008 authentic and 1536 fake Hadiths, to systematically investigate the collective impact of both linguistic and contextual features, particularly the chain of narrators (Isnad), on Hadith authentication. For the first time in this specialized domain, state-of-the-art pre-trained language models (PLMs) such as Multilingual BERT (mBERT), CamelBERT, and AraBERT are evaluated alongside classical algorithms like logistic regression (LR) and support vector machine (SVM) for Hadith authentication. Our best-performing model, AraBERT, achieved a 99.94% F1score when including the chain of narrators, demonstrating the profound effectiveness of contextual elements (Isnad) in significantly improving accuracy, providing novel insights into the indispensable role of computational methods in Hadith authentication and reinforcing traditional scholarly emphasis. This research represents a significant advancement in combating misinformation in this important field.
Alghamdi et al. (Sun,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: