What question did this study set out to answer?

The aim is to develop effective methods for detecting hate speech in Indonesian memes using a new dataset.

April 5, 2026 | Open Access

Decoding hate in memes: multimodal and multitask approaches for low-resource Indonesian social media

Key Points

Introduced the Indonesian Multimodal Meme Dataset (INDOMEME) with expert annotations.
Collected and annotated 5,023 memes based on hatefulness, appropriateness, and topical focus.
Evaluated unimodal and multimodal models, comparing performance on hate speech detection and appropriateness classification.
Implemented multitask learning through a dual-head architecture for improved detection accuracy.
Multimodal models significantly outperformed unimodal baselines in detecting hate speech.
The best model achieved a macro-F1 score of 0.820 for hate speech and 0.809 for appropriateness.
GPT-4o performed well in zero-shot settings with a macro-F1 of 0.772 for appropriateness, but struggled with hatefulness classification.
Multitask learning improved performance across text-only models, indicating that jointly modeling appropriateness and hatefulness is effective.
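
The scores above are macro-F1 values, meaning the unweighted mean of per-class F1 scores, so a minority class (such as hateful memes) counts as much as the majority class. The following is a minimal sketch of that metric in plain Python; the labels are made-up toy data rather than INDOMEME annotations, and the function name `macro_f1` is illustrative only.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical): 1 = hateful, 0 = not hateful.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy data, accuracy is 0.75 while macro-F1 is about 0.667, because the weaker F1 on the rarer class pulls the average down.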

Abstract

Memes have become a dominant medium of online expression, blending humor, satire, and cultural commentary through visual and textual elements. While often used for entertainment and community building, memes can propagate hate speech in subtle and implicit ways, making automatic detection particularly challenging. This study introduces the Indonesian Multimodal Meme Dataset (INDOMEME), the first expert-annotated multimodal dataset for hateful meme detection in the Indonesian language. The dataset contains 5,023 memes collected from Facebook and annotated under three complementary schemes: hatefulness, appropriateness, and topical focus. Each meme is further enriched with optical character recognition (OCR) text and machine-generated captions, providing a comprehensive resource for multimodal analysis. Using this dataset, the study conducts extensive experiments addressing four research questions. First, unimodal models (text-only and image-only) are benchmarked against multimodal fusion models, showing that multimodal approaches outperform unimodal baselines; the best multimodal model (IndoBERTweet + Visual Transformers (ViT)) achieves a macro-F1 of 0.820 on hate speech detection and 0.809 on appropriateness classification. Second, several state-of-the-art multimodal large language models (MLLMs), including GPT-4o, Gemini 2.5 Flash, and Gemma3 27B, are evaluated in zero-shot settings, with GPT-4o reaching a macro-F1 of 0.772 for appropriateness detection, although MLLMs remain less effective for hatefulness classification compared to supervised approaches. Finally, multitask learning is explored by jointly modeling appropriateness and hatefulness using a dual-head architecture, demonstrating consistent performance gains across text-only models. These findings underscore the benefit of multimodal resources and multitask architectures in advancing Indonesian meme hate speech detection.
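
The dual-head multitask setup mentioned in the abstract can be pictured as one shared representation feeding two classification heads whose losses are combined. The sketch below is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the encoder, the head sizes, the number of classes, or the loss weighting, so the two-class heads, the weight `alpha`, and every function name here are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply one linear layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[target])


def dual_head_loss(shared, hate_head, approp_head,
                   hate_label, approp_label, alpha=0.5):
    """Joint loss over two heads that read the same shared features.

    `alpha` is an assumed weighting between the two tasks.
    """
    hate_probs = softmax(linear(*hate_head, shared))
    approp_probs = softmax(linear(*approp_head, shared))
    return (alpha * cross_entropy(hate_probs, hate_label)
            + (1 - alpha) * cross_entropy(approp_probs, approp_label))


# Hypothetical shared features, standing in for an encoder output.
shared = [0.2, -0.1, 0.4]
# Each head is (weights, bias) with two output classes (an assumption).
hate_head = ([[0.5, -0.3, 0.1], [-0.5, 0.3, -0.1]], [0.0, 0.0])
approp_head = ([[0.2, 0.2, -0.4], [-0.2, -0.2, 0.4]], [0.1, -0.1])

loss = dual_head_loss(shared, hate_head, approp_head,
                      hate_label=1, approp_label=0)
print(f"joint loss = {loss:.4f}")
```

Because both heads depend on the same `shared` vector, gradients from either task would update the shared encoder during training, which is the usual motivation for modeling related labels jointly.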
