Large language models (LLMs) have made remarkable progress in question answering, but current approaches in the educational domain often predict an answer directly from the multiple-choice options without carefully considering each one. This can lead to suboptimal performance, especially when distractors are plausible. Human examinees commonly answer such questions by process of elimination, ruling out incorrect options one by one, a strategy largely missing from today's LLM-based educational QA. In this paper, we introduce an elimination-based reasoning framework that enables an LLM to simulate human decision-making by sequentially eliminating wrong answer options before selecting a final answer. Our method combines structured prompting with intermediate decision steps, using chain-of-thought reasoning to rule out incorrect options at each step. Experiments on multiple educational QA benchmarks demonstrate that our approach substantially outperforms standard prompting and chain-of-thought baselines. Notably, it improves accuracy and reliability, closing much of the gap to expert-level performance. We also conduct ablation studies showing the benefit of sequential elimination and analyze the decision-making process of the model. Our findings highlight that incorporating human-like elimination reasoning can significantly enhance LLM performance on complex multiple-choice questions, offering a new avenue for robust educational AI systems.
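The sequential elimination idea described in the abstract can be sketched as a simple loop. This is only an illustrative sketch, not the authors' implementation: the function names (`build_elimination_prompt`, `eliminate_sequentially`, `make_scripted_judge`), the prompt wording, and the judge interface are all assumptions, and the scripted judge stands in for a real LLM call so that the example runs on its own.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: given the question and the options still in play,
# return the label of the option judged most clearly wrong.
Judge = Callable[[str, Dict[str, str]], str]


def build_elimination_prompt(question: str, remaining: Dict[str, str]) -> str:
    """Format one elimination step as a prompt (wording is an assumption)."""
    lines = [f"Question: {question}", "Remaining options:"]
    lines += [f"{label}. {text}" for label, text in remaining.items()]
    lines.append(
        "Think step by step, then name the single option that is most clearly wrong."
    )
    return "\n".join(lines)


def eliminate_sequentially(
    question: str, options: Dict[str, str], judge: Judge
) -> Tuple[str, List[str]]:
    """Remove one option per step until a single answer remains."""
    if not options:
        raise ValueError("at least one option is required")
    remaining = dict(options)  # copy so the caller's dict is not mutated
    eliminated: List[str] = []
    while len(remaining) > 1:
        worst = judge(question, remaining)
        if worst not in remaining:
            raise ValueError(f"judge returned unknown label: {worst!r}")
        eliminated.append(worst)
        del remaining[worst]
    answer = next(iter(remaining))
    return answer, eliminated


def make_scripted_judge(order: List[str]) -> Judge:
    """Stand-in for an LLM call: eliminates labels in a fixed, scripted order.

    A real judge would send build_elimination_prompt(...) to a model and
    parse the label out of its reply; that part is deliberately left out.
    """
    queue = list(order)

    def judge(question: str, remaining: Dict[str, str]) -> str:
        while queue:
            label = queue.pop(0)
            if label in remaining:
                return label
        raise ValueError("script exhausted before a single option remained")

    return judge


question = "Which gas do plants absorb during photosynthesis?"
options = {"A": "Oxygen", "B": "Carbon dioxide", "C": "Nitrogen", "D": "Helium"}
answer, eliminated = eliminate_sequentially(
    question, options, make_scripted_judge(["D", "C", "A"])
)
print(answer, eliminated)  # B ['D', 'C', 'A']
```

Keeping the judge behind a plain callable means the elimination loop can be exercised without a model, and the list of eliminated labels gives a trace of the intermediate decisions for later inspection.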
Qianli Zhao
Mei Zhang
Journal of King Saud University - Computer and Information Sciences
Huangshan University
Guangdong Polytechnic of Science and Technology
Zhao et al. studied this question.
www.synapsesocial.com/papers/68af5701ad7bf08b1eadd7ea — DOI: https://doi.org/10.1007/s44443-025-00122-2