What type of study is this?

This is a Literature Review study.

September 12, 2025 · Open Access

A Comprehensive Review on Hate Speech Detection using BERT and Transformer-based Architectures

Key Points

BERT-based models demonstrate enhanced detection capabilities compared to traditional methods, improving classification accuracy.
The review discusses important adaptations like RoBERTa and DistilBERT, showcasing their impact on performance metrics.
Challenges such as dataset bias and class imbalance persist in hate speech detection, indicating areas for further research.
Emerging directions include integrating multimodal data and addressing fairness and explainability in automated systems.

Abstract

Hate speech detection has become a pressing challenge in the era of digital communication, as the rapid proliferation of offensive, abusive, and discriminatory content on social media platforms poses serious threats to individual well-being and societal harmony. Identifying such content is inherently complex due to linguistic ambiguity, sarcasm, implicit hate, cultural context, and multilingual variations, which often lead to misclassification by conventional systems. Earlier approaches based on machine learning with hand-crafted features and statistical models, or even deep learning techniques such as CNNs and LSTMs with static embeddings, have achieved limited success in handling these challenges. The emergence of Bidirectional Encoder Representations from Transformers (BERT) and its numerous variants has marked a paradigm shift in hate speech detection by providing deep contextualized word embeddings, bidirectional sequence modeling, and the ability to transfer knowledge across domains and languages. This review presents a comprehensive examination of BERT-based methods for hate speech and offensive language detection, analyzing their architectures, fine-tuning strategies, and adaptations such as RoBERTa, DistilBERT, ALBERT, XLM-R, and domain-specific models like HateBERT. A detailed discussion of benchmark datasets, evaluation metrics, and comparative performance across languages and platforms is provided, offering insights into the strengths and weaknesses of these models relative to traditional baselines. Moreover, the review identifies persistent challenges such as class imbalance, annotation subjectivity, dataset bias, low-resource languages, and the urgent need for explainability and fairness in automated moderation systems.
Finally, it highlights emerging research directions, including multimodal hate speech detection (text, images, and video), cross-lingual and code-switched analysis, integration of large language models (LLMs) for contextual re-ranking, and bias mitigation strategies to ensure equitable performance. By consolidating recent advancements and open challenges, this study aims to serve as a foundational reference for researchers, practitioners, and policymakers working toward the development of robust, fair, and scalable hate speech detection systems powered by BERT and transformer-based architectures.

Keywords: Hate Speech Detection, BERT, Transformers, Offensive Language, Deep Learning, NLP, Multilingual Detection, Fairness, Explainable AI
