Creating high-quality multiple-choice questions (MCQs) for language assessment is labour-intensive, requiring a careful balance of linguistic appropriacy, proficiency level, topic coverage, and distractor plausibility. We present a modular multi-agent system, built with LangChain, that generates MCQs for language assessment. Each agent in the system is responsible for a distinct task in the question generation pipeline. These tasks range from topic selection and question formation to answer validation, distractor generation, and coverage checks. The system supports flexible substitution of Large Language Models (LLMs), allowing comparative benchmarking across tasks in terms of generation accuracy and latency. Human expert assessment of item quality confirmed that the best-performing configurations scored above 95% for grammatical correctness at satisfactory speed. Our results demonstrate that multi-agent LLM-based architectures can effectively automate complex educational content creation workflows while offering transparency, modularity, and fine-grained controllability. The proposed system offers a reusable design pattern for intelligent educational content generation in broader domains.
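The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and deliberately avoids LangChain: each "LLM" is assumed to be any callable from prompt text to reply text, so a different model can be substituted per agent. All names here (`Item`, `Agent`, `build_agents`, `missing_topics`, `stub_llm`), the prompts, and the `question || answer` reply format are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Assumption: an LLM is any callable mapping a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class Item:
    """Shared state passed from agent to agent (hypothetical schema)."""
    topic: str = ""
    question: str = ""
    answer: str = ""
    distractors: List[str] = field(default_factory=list)
    valid: bool = False


@dataclass
class Agent:
    """One pipeline stage bound to its own, swappable model."""
    name: str
    llm: LLM
    step: Callable[[Item, LLM], Item]

    def run(self, item: Item) -> Item:
        return self.step(item, self.llm)


def select_topic(item: Item, llm: LLM) -> Item:
    item.topic = llm("Suggest one grammar topic.").strip()
    return item


def form_question(item: Item, llm: LLM) -> Item:
    reply = llm(f"Write a gap-fill question on {item.topic} as 'question || answer'.")
    question, _, answer = reply.partition("||")
    item.question, item.answer = question.strip(), answer.strip()
    return item


def validate_answer(item: Item, llm: LLM) -> Item:
    verdict = llm(f"Is '{item.answer}' correct for: {item.question}? Reply yes or no.")
    item.valid = verdict.strip().lower().startswith("yes")
    return item


def generate_distractors(item: Item, llm: LLM) -> Item:
    reply = llm(f"Give three wrong options for: {item.question}, separated by ';'.")
    options = [d.strip() for d in reply.split(";")]
    # Drop empty strings and anything identical to the correct answer.
    item.distractors = [d for d in options if d and d != item.answer]
    return item


def build_agents(default_llm: LLM, overrides: Optional[Dict[str, LLM]] = None) -> List[Agent]:
    """Create the stages in order; `overrides` swaps the model for named agents."""
    overrides = overrides or {}
    steps = [
        ("topic", select_topic),
        ("question", form_question),
        ("validation", validate_answer),
        ("distractors", generate_distractors),
    ]
    return [Agent(name, overrides.get(name, default_llm), step) for name, step in steps]


def run_pipeline(agents: List[Agent]) -> Item:
    item = Item()
    for agent in agents:
        item = agent.run(item)
    return item


def missing_topics(items: List[Item], required: List[str]) -> List[str]:
    """Coverage check: required topics with no validated item yet."""
    covered = {i.topic for i in items if i.valid}
    return [t for t in required if t not in covered]


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, used only for demonstration."""
    if prompt.startswith("Suggest"):
        return "past simple"
    if prompt.startswith("Write"):
        return "She ___ to school yesterday. || went"
    if prompt.startswith("Is"):
        return "yes"
    return "go; goes; going"


item = run_pipeline(build_agents(stub_llm))
gaps = missing_topics([item], ["past simple", "articles"])
```

Because each agent holds its own model, a call such as `build_agents(stub_llm, overrides={"distractors": other_llm})` would swap the model for a single stage, which is the kind of per-task comparison the abstract mentions; `other_llm` is a placeholder for any callable with the same shape.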
Peng Zhao
John Blake
Evgeny Pyshkin
Procedia Computer Science
University of Aizu
Zhao et al. studied this question.
www.synapsesocial.com/papers/69c0de74fddb9876e79c1409 — DOI: https://doi.org/10.1016/j.procs.2026.01.070