What question did this study set out to answer?

The study aims to develop an efficient content moderation system that addresses harmful content on online discussion platforms.

March 25, 2026 | Open Access

Design and Evaluation of a Scalable, Two-Tier LLM Content Moderation Pipeline for Real-Time Discussion Platforms

Key Points

Designed a two-tier hybrid moderation pipeline with distinct operational tiers.
Implemented a synchronous rule-based engine for immediate violation detection.
Used an asynchronous LLM-based analysis for complex context processing.
Incorporated Kafka Streams for optimized batch processing.
Achieved a Macro-F1 score of 0.8945 on a dataset of 1000 entries.
The rule-based engine demonstrated an internal latency of 8ms (p50).
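
As a reminder of what the reported metric measures, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The following is a minimal sketch of that computation; the class labels (`ok`, `spam`, `hate`) and the toy data are illustrative assumptions, not the paper's label set or evaluation data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only.
y_true = ["ok", "ok", "spam", "hate"]
y_pred = ["ok", "spam", "spam", "hate"]
score = macro_f1(y_true, y_pred)
```

On this toy input the per-class F1 scores are 2/3 (`ok`), 2/3 (`spam`), and 1 (`hate`), so the macro average is 7/9.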

Abstract

On online platforms, harmful content such as hate speech, spam, and leaked personal information is a key factor that degrades user experience. This study proposes a Two-Tier Hybrid Moderation Pipeline based on Hexagonal Architecture to address the trade-off between accuracy and latency. The system combines a synchronous rule-based engine (Tier 1) that processes clear violations immediately with an asynchronous LLM-based pipeline (Tier 2) for complex contextual analysis, and uses Kafka Streams-based batch processing to optimize cost and throughput. The proposed pipeline achieved a Macro-F1 score of 0.8945 on an evaluation dataset of N=1000, and the rule-based engine showed an internal latency of 8 ms (p50).

Acknowledgment: Language editing was assisted by AI-based tools (ChatGPT, Claude).
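
The two-tier flow described in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the regex rules, the `stub_llm_classify` function, and the batch size are hypothetical, an in-memory `asyncio.Queue` stands in for the Kafka Streams batching layer, and a keyword check stands in for the LLM call.

```python
import asyncio
import re

# Hypothetical Tier 1 rules; the paper's actual rule set is not given here.
RULES = {
    "pii": re.compile(r"\b\d{3}-\d{3,4}-\d{4}\b"),  # phone-number-like pattern
    "spam": re.compile(r"(?i)\bbuy now\b"),
}


def tier1_check(text):
    """Synchronous rule check: return the first matching rule name, else None."""
    for name, pattern in RULES.items():
        if pattern.search(text):
            return name
    return None


async def stub_llm_classify(batch):
    """Placeholder for the Tier 2 LLM call (assumption: keyword stub)."""
    await asyncio.sleep(0)  # stands in for network latency
    return ["hate" if "idiot" in text.lower() else "ok" for text in batch]


async def tier2_worker(queue, results, batch_size=4):
    """Drain the queue in small batches and record one label per post."""
    while True:
        batch = [await queue.get()]
        while len(batch) < batch_size and not queue.empty():
            batch.append(queue.get_nowait())
        labels = await stub_llm_classify([text for _, text in batch])
        for (post_id, _), label in zip(batch, labels):
            results[post_id] = label
            queue.task_done()


async def moderate(posts):
    """Block rule violations at once; defer everything else to Tier 2."""
    queue, results = asyncio.Queue(), {}
    worker = asyncio.create_task(tier2_worker(queue, results))
    for post_id, text in posts:
        verdict = tier1_check(text)
        if verdict:
            results[post_id] = f"blocked:{verdict}"
        else:
            queue.put_nowait((post_id, text))
    await queue.join()  # wait until Tier 2 has labeled every deferred post
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return results


posts = [(1, "Call 010-1234-5678"), (2, "You idiot"), (3, "Nice post")]
results = asyncio.run(moderate(posts))
```

The point of the split is that Tier 1 returns a verdict on the request path, while Tier 2 work is queued and grouped so that the slower, costlier model call is amortized over several posts.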

Cite This Study

Im Sin Cheong studied this question.

synapsesocial.com/papers/69c37b93b34aaaeb1a67e0dd https://doi.org/10.5281/zenodo.19186780
