Cyberbullying is defined as the use of social media platforms to hurt or humiliate people online. The anonymity these platforms offer makes it easy to spread hurtful content, sometimes leading victims to self-harm. This highlights the urgent need for efficient methods to identify and prevent cyberbullying. Numerous studies have addressed this issue through cyberbullying classification, primarily either binary classification on multi-modal data or multi-class classification restricted to text or image data alone. While deep-learning advances have improved cyberbullying identification, a gap remains in the multi-class classification of cyberbullying in multi-modal data such as memes. This research aims to bridge that gap by classifying cyberbullying across multiple data modalities with a hybrid deep-learning model, hybrid(RoBERTa+ViT), which combines RoBERTa for text feature extraction and Vision Transformer (ViT) for image feature extraction through a late fusion module. Two datasets were used: a Private-dataset collected from comments on social media videos and a Public-dataset downloaded from existing research. The hybrid model was trained on both datasets and demonstrated notable performance, achieving an accuracy of 99.24% on the Public-dataset and 96.1% on the Private-dataset, with F1-scores of 0.9924 and 0.9599, respectively.
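The late fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "late fusion" here means concatenating the separately extracted text and image feature vectors and passing them to a single linear classification head, which is only one common reading of the term. The function `late_fusion_classify`, the feature dimensions, the number of classes, and the random placeholder values are all hypothetical; in a real pipeline the two vectors would come from RoBERTa and ViT encoders and the head would be trained.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion_classify(text_features, image_features, weights, bias):
    """Fuse per-modality features by concatenation, then apply a linear head.

    Hypothetical helper: each modality is encoded independently first,
    and the two feature vectors are only combined at this stage.
    """
    fused = list(text_features) + list(image_features)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Placeholder sizes and values (assumptions, not taken from the paper).
TEXT_DIM, IMAGE_DIM, NUM_CLASSES = 8, 8, 4
rng = random.Random(0)

# Stand-ins for the encoder outputs of one meme (caption text + image).
text_features = [rng.gauss(0.0, 1.0) for _ in range(TEXT_DIM)]
image_features = [rng.gauss(0.0, 1.0) for _ in range(IMAGE_DIM)]

# Untrained, randomly initialised classification head.
weights = [
    [rng.gauss(0.0, 0.1) for _ in range(TEXT_DIM + IMAGE_DIM)]
    for _ in range(NUM_CLASSES)
]
bias = [0.0] * NUM_CLASSES

probs = late_fusion_classify(text_features, image_features, weights, bias)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
print(probs, predicted_class)
```

Because the modalities are combined only after each encoder has produced its own representation, either encoder could be swapped or fine-tuned independently; an alternative form of late fusion would instead average per-modality class probabilities.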
Tabassum et al. studied this question.