April 3, 2026 | Open Access

A Machine Learning Method for Identifying Offensive Language in Text from Social Media

Key Points

The aim is to develop an effective method for identifying offensive language in social media content using machine learning.
Text preprocessing: cleaning, tokenisation, and stop-word removal
Feature extraction through TF-IDF vectorisation
Training a supervised machine learning model
Performance evaluation using accuracy, precision, recall, and F1-score
The model accurately detects offensive language with high precision and recall
Supports automated content moderation for safer online interactions
Potential for expansion to handle multilingual data and real-time scenarios
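
The preprocessing and TF-IDF steps listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the stop-word list, the cleaning rules (lower-casing, dropping URLs and user mentions), the example posts, and the TF-IDF variant (term frequency times log(N / df)) are all hypothetical choices that the summary does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "to", "and", "of", "this"}


def preprocess(text):
    """Clean, tokenise, and remove stop words from one post."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # drop URLs and user mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # keep letters and whitespace only
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenised document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up example posts, for illustration only.
posts = [
    "You are an idiot @user http://example.com",
    "This is a lovely day",
    "What an idiot move, lovely",
]
tokens = [preprocess(p) for p in posts]
vectors = tfidf(tokens)
```

Terms that occur in fewer posts receive a larger weight, which is why "day" outweighs "lovely" in the second post. The resulting weight vectors are what a supervised classifier would take as input.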

Abstract

The rapid expansion of social media platforms has brought a tremendous rise in user-generated content and, with it, serious concerns about harsh and abusive language. Detecting such content is crucial to preserving a healthy online environment, since it can have detrimental effects on individuals and online communities. This work proposes an intelligent approach that uses machine learning and natural language processing (NLP) to identify offensive language in social media text. In the proposed method, textual data is preprocessed by cleaning, tokenisation, and stop-word removal. Features are then extracted using methods such as TF-IDF vectorisation, and a supervised machine learning model is trained to distinguish between offensive and non-offensive language. The model's performance is assessed with standard measures: accuracy, precision, recall, and F1-score. According to the experimental findings, the proposed approach detects offensive content with high accuracy, supporting automated content moderation. To promote safer and more responsible use of social media platforms, the system can be further extended to handle multilingual data and real-time applications.
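
The four evaluation measures named in the abstract can be computed directly from true and predicted labels. The sketch below is illustrative only: the function name, the label encoding (1 = offensive, 0 = non-offensive), and the toy labels are assumptions made for this example and are not results from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = offensive, 0 = non-offensive).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Precision and recall are reported alongside accuracy because offensive posts are often a minority class, where accuracy alone can look high even if most offensive content is missed.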
