Large language models (LLMs) excel at many QA tasks but still struggle with multiple-choice question answering (MCQA), especially when the distractors are strong. Humans often solve such questions by eliminating implausible options and then verifying the remaining candidates. We propose a model-agnostic structured elimination framework that unifies stepwise elimination, answer verification, and self-consistency. Instantiated with LLaMA-3 (8B) as the primary backbone, the framework performs multi-round option elimination, optionally verifies each elimination via internal checks or lightweight evidence retrieval (e.g., Wikipedia), and aggregates multiple sampled elimination chains into a robust final decision. We introduce SportsMCQ-5k, a 5,000-question MCQA benchmark for sports training, and evaluate on it alongside CommonsenseQA, Social IQa, and MedMCQA. Across all four datasets, our method improves accuracy over strong 7B–9B open-source baselines by 4–7 points, and ablations confirm that both verification and self-consistency contribute to the gains. The proposed framework enhances robustness and interpretability for educational assessment, including sports training and other discipline-specific testing.
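The elimination, verification, and self-consistency steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `Scorer` and `Verifier` signatures, and the keyword-overlap `toy_scorer` are all hypothetical stand-ins for what would be LLM calls and evidence retrieval in a real system, and majority voting is assumed as the aggregation rule.

```python
import random
from collections import Counter
from typing import Callable, Dict, Optional

# Hypothetical signatures (assumptions, not from the paper):
# a scorer returns a plausibility score for one option (stand-in for a sampled LLM judgment);
# a verifier returns True when eliminating that option is confirmed to be safe.
Scorer = Callable[[str, str, random.Random], float]
Verifier = Callable[[str, str], bool]


def elimination_chain(question: str, options: Dict[str, str], scorer: Scorer,
                      rng: random.Random, verifier: Optional[Verifier] = None) -> str:
    """Run one elimination chain: drop one option per round until one remains."""
    if not options:
        raise ValueError("at least one option is required")
    remaining = dict(options)
    while len(remaining) > 1:
        scores = {label: scorer(question, text, rng) for label, text in remaining.items()}
        ranked = sorted(scores, key=scores.get)  # least plausible first
        drop = ranked[0]  # fallback if no elimination is verified
        if verifier is not None:
            for label in ranked:
                if verifier(question, remaining[label]):
                    drop = label
                    break
        del remaining[drop]
    return next(iter(remaining))


def answer_with_self_consistency(question: str, options: Dict[str, str], scorer: Scorer,
                                 n_chains: int = 5, seed: int = 0,
                                 verifier: Optional[Verifier] = None) -> str:
    """Sample several elimination chains and return the majority-vote answer label."""
    rng = random.Random(seed)
    votes = Counter(
        elimination_chain(question, options, scorer, rng, verifier)
        for _ in range(n_chains)
    )
    return votes.most_common(1)[0][0]


def toy_scorer(question: str, option_text: str, rng: random.Random) -> float:
    """Toy stand-in for an LLM: keyword overlap plus small sampling noise."""
    words = [w.strip("?.,").lower() for w in question.split()]
    overlap = sum(1 for w in words if len(w) > 3 and w in option_text.lower())
    return overlap + rng.uniform(0.0, 0.4)


if __name__ == "__main__":
    question = "Which swimming stroke is performed on the back?"
    options = {"A": "Butterfly", "B": "Backstroke", "C": "Breaststroke", "D": "Freestyle"}
    print(answer_with_self_consistency(question, options, toy_scorer, n_chains=7))
```

In this sketch the verifier acts as a veto: an option is only dropped if its elimination is confirmed, and the chain falls back to the least plausible option when nothing is confirmed. Sampling noise in the scorer is what makes the chains differ, so the vote over `n_chains` plays the role of self-consistency.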
Ye Teng
Cheng Wang
Journal of King Saud University - Computer and Information Sciences
East China University of Technology
Teng et al. studied this question.
www.synapsesocial.com/papers/69be38446e48c4981c6789c7 — DOI: https://doi.org/10.1007/s44443-026-00643-4