Large language models (LLMs) have shown remarkable progress in text-based reasoning, but they continue to struggle with large and complex tables. Token limitations and the loss of structural dependencies make it difficult for a single model to extract evidence, connect relationships, and produce reliable answers. To address this challenge, we introduce Chain of Agents (CoAgt), a multi-agent framework inspired by how humans reason with tables: scanning data, comparing values, retaining information, and making decisions. In CoAgt, different agents share the workload: Collectors gather evidence from subtables, the Synthesizer integrates their findings into a coherent answer, and the Refiner checks that answer for clarity and correctness. This division of labor makes the reasoning process both scalable and interpretable, and it reduces the risk of errors that often occur when one model handles every step alone. We evaluate CoAgt on two widely used benchmarks. On WikiTableQuestions (WikiTQ), it achieves 85.4% accuracy, outperforming the strong Chain-of-Table baseline by more than twenty-five percentage points. On Table Fact-Checking (TabFact), it reaches 96.6% accuracy, exceeding previous state-of-the-art results. These findings show that dividing reasoning among specialized agents not only improves performance but also offers a transparent and reliable approach to reasoning over large, complex tables.
Alrayzah et al. studied this question.
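The division of labor described in the abstract (Collectors over subtables, then a Synthesizer, then a Refiner) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the row-wise chunking, the `chunk_size` parameter, and the toy table are all assumptions made for illustration, and the three agents are plain Python stand-ins for what would be LLM calls in practice.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]

# Hypothetical agent signatures; in a real system each would wrap an LLM call.
Collector = Callable[[str, List[Row]], Row]
Synthesizer = Callable[[str, List[Row]], str]
Refiner = Callable[[str, str], str]


def split_into_subtables(rows: List[Row], chunk_size: int) -> List[List[Row]]:
    """Split a table row-wise so each subtable is small enough for one agent.

    Row-wise chunking is an assumption; the abstract does not say how
    subtables are formed.
    """
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def run_pipeline(rows: List[Row], question: str, collector: Collector,
                 synthesizer: Synthesizer, refiner: Refiner,
                 chunk_size: int = 2) -> str:
    """Collect evidence per subtable, merge it into a draft, then refine."""
    subtables = split_into_subtables(rows, chunk_size)
    evidence = [collector(question, sub) for sub in subtables]
    draft = synthesizer(question, evidence)
    return refiner(question, draft)


# Toy stand-ins for the question "Which city has the largest population?"
def toy_collector(question: str, subtable: List[Row]) -> Row:
    # Local evidence: the largest row within this subtable only.
    return max(subtable, key=lambda r: int(r["population"]))


def toy_synthesizer(question: str, evidence: List[Row]) -> str:
    # Global step: compare the local winners and draft an answer.
    best = max(evidence, key=lambda r: int(r["population"]))
    return f"  The answer is {best['city']}  "


def toy_refiner(question: str, draft: str) -> str:
    # Clean-up step: reduce the draft to the bare answer string.
    return draft.strip().split()[-1]


# Made-up data, for illustration only.
TABLE: List[Row] = [
    {"city": "Alpha", "population": "120"},
    {"city": "Beta", "population": "340"},
    {"city": "Gamma", "population": "90"},
    {"city": "Delta", "population": "910"},
    {"city": "Epsilon", "population": "560"},
]

answer = run_pipeline(TABLE, "Which city has the largest population?",
                      toy_collector, toy_synthesizer, toy_refiner)
print(answer)
```

In this sketch no single agent sees the whole table: each Collector sees only its subtable, and the Synthesizer sees only the collected evidence. That mirrors the scalability argument made in the abstract, under the stated assumptions.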