December 23, 2019

Toxicity Detection on Bengali Social Media Comments using Supervised Models

Abstract

Social media plays an indispensable role in our daily lives, providing a public platform for sharing opinions. Those opinions can include threats, spam, and vulgar words, often referred to as toxic comments. This type of expression reflects the anti-social behavior of the commenters and may harm the online atmosphere. Filtering such toxic comments with handcrafted rules is cumbersome because the comments are unstructured and often include misspelled obscene words. Automated machine learning-based models for classifying such comments constitute a part of Sentiment Analysis and are used extensively for the English language, showing more promising results than statistical models. Although Bengali is widely spoken around the globe, little research has been done on detecting toxic comments in this language. Hence, in this paper, we provide a comparative analysis of five supervised learning models (Naive Bayes, Support Vector Machines, Logistic Regression, Convolutional Neural Network, and Long Short-Term Memory) for detecting toxic Bengali comments in an annotated, publicly available dataset. We demonstrate that both deep learning-based models outperform the other classifiers by a 10% margin, with the Convolutional Neural Network achieving the highest accuracy of 95.30%.
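
To make the supervised setup concrete, here is a minimal sketch of the simplest of the five model families, a multinomial Naive Bayes text classifier, written in plain Python. It is not the authors' implementation. The whitespace tokenizer, the Laplace smoothing value, the class and function names, and the tiny English toy examples are all assumptions made for illustration; the paper's Bengali dataset and preprocessing are not described in this abstract.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain whitespace tokenization; real preprocessing may differ.
    return text.lower().split()


class NaiveBayesSketch:
    """Hypothetical multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                if tok in self.vocab:  # skip words never seen in training
                    num = self.word_counts[label][tok] + self.alpha
                    score += math.log(num / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    # Fraction of comments whose predicted label matches the annotation.
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Toy placeholder data (invented for illustration, not from the paper).
train_texts = [
    "you are an idiot",
    "i will hurt you",
    "have a nice day",
    "thanks for sharing this",
]
train_labels = ["toxic", "toxic", "clean", "clean"]

model = NaiveBayesSketch().fit(train_texts, train_labels)
```

In a comparison like the one described above, each model would be trained on the same annotated split and scored with the same `accuracy` function on held-out comments; the toy data here is far too small to say anything about real performance.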
