What question did this study set out to answer?

The aim is to enhance the detection of offensive language and hate speech on social media using machine learning techniques.

February 5, 2026Open Access

Offensive Text Detection Using Hybrid Ensemble Text Classifier: An Experimental Study

Key Points

The aim is to enhance the detection of offensive language and hate speech on social media using machine learning techniques.
Analysis of individual machine learning models using classical algorithms.
Evaluation of model performance using metrics such as recall, precision, F1-score, and accuracy.
Development of a novel ensemble strategy combining the top three classifiers.
Individual classifiers achieved accuracies between 85% and 89%.
The ensemble model attained superior accuracy of 90%.
High recall rates were maintained across all detection categories.

Abstract

Social media platforms’ rapid expansion has given users the freedom to voice their thoughts, promoting international cooperation and communication. However, this freedom has also contributed to the spread of abusive language and hate speech, which can have negative impacts on both individuals and communities. Maintaining safe online spaces while upholding the right to free speech has made identifying such content a crucial task. Recent developments in machine learning (ML) provide encouraging solutions by making it possible to automatically identify inappropriate language with a high degree of accuracy. However, because hate speech and offensive language are complex and heavily influenced by context, intent, and cultural variables, it is still difficult to discriminate between them with precision. The dataset used in this work contains labeled tweets categorized into three classes: hate speech, offensive language, and neutral language. Initially, an analysis of individual ML models, including classical algorithms, was performed. The performance of these algorithms was evaluated using recall, precision, F1-score, and accuracy metrics. To further improve detection accuracy, a novel ensemble strategy is proposed that combines the three best-performing models. The ensemble offers a robust solution for offensive language detection by leveraging the strengths of its component models. Experimental evaluations show that the ensemble model achieves superior performance compared to individual classifiers, which had accuracies between 85% and 89%, achieving an accuracy of 90% with high recall across all categories.

Read Full Paperexternally

AI에게 질문

Bookmark

View Full Paper