Despite advances in mathematical reasoning capabilities, Large Language Models (LLMs) still struggle with calculation verification when using established prompting techniques. We present MDToC (Metacognitive Dynamic Tree of Concepts), a three-phase approach that constructs a concept tree, develops accuracy-verified calculations for each concept, and employs majority voting to evaluate competing solutions. Evaluations on the CHAMP, MATH, and Game-of-24 benchmarks demonstrate MDToC's effectiveness: GPT-4-Turbo achieves 58.1\% on CHAMP, 86.6\% on MATH, and 85\% on Game-of-24, outperforming GoT by 5\%, 5.4\%, and 4\%, respectively, without hand-engineered hints. MDToC consistently surpasses existing prompting methods across all backbone models, yielding improvements of up to 7.6\% over ToT and 6.2\% over GoT, establishing metacognitive calculation verification as a promising direction for enhanced mathematical reasoning.
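The abstract names majority voting as the third phase but does not specify the procedure. As a minimal sketch, assuming plain plurality voting over the final answers of competing solutions (the function name `majority_vote` and the sample answers are hypothetical, not taken from the paper), the idea could look like this:

```python
from collections import Counter

def majority_vote(candidates):
    """Return the most frequent candidate answer and its share of the votes.

    Assumption: each candidate is a hashable final answer (e.g. a string)
    produced by one competing solution; ties go to the answer seen first.
    """
    if not candidates:
        raise ValueError("no candidate answers to vote on")
    counts = Counter(candidates)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(candidates)

# Hypothetical final answers from five competing solutions.
answers = ["24", "24", "18", "24", "18"]
winner, share = majority_vote(answers)
print(winner, share)
```

The vote share can serve as a rough confidence signal: a low share suggests the competing solutions disagree and the answer may merit re-verification.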
Tung Duong Ta
Tim Oates
Thien Van Luong
Ta et al. (Mon,) studied this question.
synapsesocial.com/papers/6975b20efeba4585c2d6d994 — DOI: https://doi.org/10.13016/m2wtvb-ypf6