December 5, 2025 · Open Access

CoAgt: unleashing the reasoning capabilities of large language models on tabular data with a chain of agents

Key Points

CoAgt improves LLM reasoning over tabular data by splitting the work across a chain of specialized agents.
It reaches 85.4% accuracy on WikiTableQuestions and 96.6% on TabFact, outperforming prior methods on both benchmarks.
The division of labor assigns evidence collection to Collectors, integration to the Synthesizer, and a final check for clarity and correctness to the Refiner.
Beyond the accuracy gains, the approach offers a transparent and reliable way to reason over large, complex tables.

Abstract

Large language models (LLMs) have shown remarkable progress in text-based reasoning, but they continue to struggle with large and complex tables. Token limitations and the loss of structural dependencies make it difficult for a single model to extract evidence, connect relationships, and produce reliable answers. To address this challenge, we introduce Chain of Agents (CoAgt), a multi-agent framework inspired by how humans reason with tables: scanning data, comparing values, retaining information, and making decisions. In CoAgt, different agents share the workload: Collectors gather evidence from subtables, the Synthesizer integrates their findings into a coherent answer, and the Refiner ensures clarity and correctness. This division of labor makes the reasoning process both scalable and interpretable, while reducing the risk of errors that often occur when one model handles all steps alone. We evaluate CoAgt on two widely used benchmarks. On WikiTableQuestions (WikiTQ), it achieves 85.4% accuracy, outperforming the strong Chain-of-Table baseline by more than twenty-five percentage points. On Table Fact-Checking (TabFact), it reaches 96.6% accuracy, exceeding previous state-of-the-art results. These findings show that breaking reasoning into specialized agents not only improves performance but also offers a transparent and reliable approach for reasoning over large, complex tables.
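The Collector, Synthesizer, and Refiner roles described in the abstract can be pictured as a short pipeline. The sketch below is illustrative only and is not the authors' implementation: the function names (`split_table`, `chain_of_agents`), the row-wise chunking, the `chunk_size` parameter, and the toy table are all assumptions made for this example, and the three "agents" are deterministic stand-ins where a real system would call an LLM.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
# Hypothetical agent signatures; in practice each would wrap an LLM call.
Collector = Callable[[str, List[Row]], str]
Synthesizer = Callable[[str, List[str]], str]
Refiner = Callable[[str, str], str]


def split_table(rows: List[Row], chunk_size: int) -> List[List[Row]]:
    """Cut a table into row-wise subtables (an assumed chunking strategy)."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def chain_of_agents(rows: List[Row], question: str, collector: Collector,
                    synthesizer: Synthesizer, refiner: Refiner,
                    chunk_size: int = 2) -> str:
    # 1. Each Collector sees one subtable and reports evidence for the question.
    evidence = [collector(question, sub) for sub in split_table(rows, chunk_size)]
    # 2. The Synthesizer merges all evidence into a draft answer.
    draft = synthesizer(question, evidence)
    # 3. The Refiner cleans up the draft into the final answer.
    return refiner(question, draft)


# Toy stand-ins for the agents, valid only for "which name has the top score?".
def toy_collector(question: str, subtable: List[Row]) -> str:
    best = max(subtable, key=lambda r: int(r["score"]))
    return f"{best['name']}={best['score']}"


def toy_synthesizer(question: str, evidence: List[str]) -> str:
    name, _ = max((e.split("=") for e in evidence), key=lambda p: int(p[1]))
    return f"  the answer is {name} "


def toy_refiner(question: str, draft: str) -> str:
    return draft.strip().split()[-1]


# Made-up data for illustration; not taken from WikiTQ or TabFact.
rows = [
    {"name": "alpha", "score": "12"},
    {"name": "beta", "score": "47"},
    {"name": "gamma", "score": "31"},
    {"name": "delta", "score": "8"},
    {"name": "epsilon", "score": "29"},
]
answer = chain_of_agents(rows, "Which name has the highest score?",
                         toy_collector, toy_synthesizer, toy_refiner)
print(answer)  # prints "beta"
```

Because no Collector ever sees the full table, each call stays within a fixed context budget, and the intermediate `evidence` list can be inspected to trace how the final answer was reached.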
