Abstract: Uncensored harmful and sarcastic memes are shared across the web. We explore the following questions: What kinds of memes can be considered harmful? Can user engagement metrics, in addition to visual and textual features, be used to measure the degree of harmfulness of a meme? What are the predominant topics and posting timelines of memes? We also aim to analyze and understand the reasons for the popularity and shareability of meme posts. In this paper, we describe and share MemePeril, a manually annotated, multitopic, temporal dataset. It contains harmful content and features appearing in Reddit posts. We analyze comments and upvotes to understand the community sentiment toward a meme. Evolution analytics on the MemePeril dataset provide insight into the dynamics of modern-world memes over hourly and weekly time periods. We use samples from the MemePeril dataset to prompt a large language model (LLM), which efficiently tunes the LLM for annotation and synthetic data generation tasks. Beyond descriptive analysis, MemePeril is designed to directly benefit the machine learning community. It can be used to train supervised classifiers (e.g., for meme virality prediction), benchmark temporal content analysis models, and enable multimodal engagement forecasting using annotated meme trends.
Singh et al. studied this question.