Offensive comments and hate speech have become a challenge for content moderation on online social networks, and research on automated moderation techniques for Brazilian Portuguese remains limited. This study contributes to the development of an efficient system for detecting and classifying offensive comments in Brazilian Portuguese using natural language processing and machine learning techniques. The approach draws on a novel dataset of 4,139 Brazilian Portuguese comments extracted from YouTube and manually labeled. Four classical text classification algorithms (Naive Bayes, SVM, Random Forest, and GBM) were compared, each paired with two text vectorizers, CountVectorizer and TF-IDF. The Random Forest model combined with CountVectorizer performed best, achieving 86% accuracy. This result highlights the feasibility of using classical machine learning methods for content moderation in Brazilian Portuguese. The study also contributes a specialized dataset and makes it available, promoting advances in automated moderation and providing a useful resource for developing models focused on the Portuguese language. The work thus reinforces the potential of machine learning to promote safer and more inclusive online environments.
Alves et al. (Thu,) studied this question.
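
The comparison described in the abstract, four classifiers crossed with two vectorizers, can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name explicitly. It also assumes that "SVM" maps to `LinearSVC`, "Naive Bayes" to `MultinomialNB`, and "GBM" to `GradientBoostingClassifier`. The hyperparameters, the 2-fold cross-validation, and the eight toy comments are hypothetical placeholders, not the study's data or configuration, so the printed accuracies carry no meaning beyond showing the mechanics.

```python
from itertools import product

from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical placeholder comments (NOT taken from the study's dataset).
# Label 1 = offensive, 0 = not offensive.
texts = [
    "você é um idiota", "que vídeo ótimo",
    "cala a boca imbecil", "adorei a explicação",
    "seu lixo inútil", "muito bom, parabéns",
    "ninguém te suporta, idiota", "obrigado pelo conteúdo",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# The two vectorizers and four classifier families named in the abstract.
# The concrete scikit-learn classes and settings are assumptions.
vectorizers = {"CountVectorizer": CountVectorizer, "TF-IDF": TfidfVectorizer}
models = {
    "Naive Bayes": MultinomialNB(),
    "SVM": LinearSVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "GBM": GradientBoostingClassifier(n_estimators=20, random_state=0),
}

# Evaluate every (vectorizer, classifier) pair with cross-validated accuracy.
results = {}
for (vec_name, vec_cls), (model_name, model) in product(
    vectorizers.items(), models.items()
):
    pipeline = make_pipeline(vec_cls(), model)
    scores = cross_val_score(pipeline, texts, labels, cv=2, scoring="accuracy")
    results[(vec_name, model_name)] = scores.mean()

# Report the combinations from highest to lowest mean accuracy.
for (vec_name, model_name), acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{model_name:13s} + {vec_name:15s}: {acc:.2f}")
```

Wrapping each vectorizer and classifier in a single pipeline keeps the vocabulary fitting inside each cross-validation fold, which avoids leaking test-fold vocabulary into training. On a real labeled corpus, a larger number of folds or a held-out test split would be the more usual choice.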