September 10, 2025 · Open Access

Application Analysis of Multimodal Models in Hateful Meme Detection

Key Points

Incorporating cross-attention mechanisms during image-text fusion enhances detection performance.
Optimization techniques like multi-task learning improve the robustness of multimodal models.
Model distillation techniques enable quicker detection of hateful memes with minimal accuracy loss.
Evaluation metrics guide the assessment of model effectiveness in identifying online hate.

Abstract

Hateful memes are internet memes that overlay short text on images and spread virally, often carrying offensive content that targets groups based on gender, religion, race, or other characteristics. Their rapid dissemination and harmful impact make targeted detection critically important. Multimodal models, which process images and text jointly, can accurately identify hateful content in memes. This paper analyzes the image-text fusion methods, optimization strategies, and evaluation metrics used by multimodal models in hateful meme detection. Results show that incorporating cross-attention mechanisms during the image-text fusion stage effectively captures complementary information between modalities, thereby enhancing downstream task performance. Optimization techniques such as multi-task learning and adversarial training further improve model robustness and detection accuracy. Model distillation enables faster detection with minimal accuracy loss, which facilitates the timely identification of newly released hateful memes. In summary, this paper argues that multimodal models hold significant potential for mitigating the spread of online hate, and it provides theoretical and practical references for related research.
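To make the cross-attention fusion step concrete, the following is a minimal sketch and not the paper's implementation. It assumes text-token features act as queries over image-patch features (keys and values); the function names, dimensions, and random projection matrices are hypothetical stand-ins for learned parameters.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(text_tokens, image_patches, w_q, w_k, w_v):
    """Single-head cross-attention: text tokens attend over image patches.

    text_tokens:   (n_text, d_text) text features (queries come from here)
    image_patches: (n_img, d_img) image features (keys and values come from here)
    w_q, w_k, w_v: projection matrices into a shared dimension d
    """
    q = text_tokens @ w_q                      # (n_text, d)
    k = image_patches @ w_k                    # (n_img, d)
    v = image_patches @ w_v                    # (n_img, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product, (n_text, n_img)
    weights = softmax(scores, axis=-1)         # each text token's weights over patches
    fused = weights @ v                        # image information gathered per text token
    return fused, weights


# Hypothetical sizes: 5 text tokens, 9 image patches, shared dimension 4.
rng = np.random.default_rng(0)
d_text, d_img, d = 8, 6, 4
text_tokens = rng.normal(size=(5, d_text))
image_patches = rng.normal(size=(9, d_img))
w_q = rng.normal(size=(d_text, d))   # random stand-ins for learned weights
w_k = rng.normal(size=(d_img, d))
w_v = rng.normal(size=(d_img, d))

fused, weights = cross_attention(text_tokens, image_patches, w_q, w_k, w_v)
```

Each row of `weights` is a distribution over image patches for one text token, so `fused` holds, per token, the image content most relevant to that token. In a full detector, a representation like this would typically be pooled and passed to a classification head.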
