May 26, 2026 | Open Access

Tool-Schema Compression Enables Agentic RAG Under Constrained Context Budgets

Key Points

This study investigates the trade-off between tool-definition tokens and context-window space in agentic RAG systems.
It systematically evaluates 14 models, spanning 1.5B–32B local models plus one frontier API model, across 6,566 controlled API calls.
Experiments use three context budgets (8K, 16K, 32K) with 28 tool definitions.
TSCG conservative-profile compression is applied, saving 44–50% of schema tokens.
At 8K tokens, uncompressed JSON schemas overflow the context window and yield 2.6% average exact-match (EM); compressed schemas lift EM by +20.5 pp on average across eight models.
At 32K tokens, where both formats fit, four of five tested models show Δ ≤ 1 pp, indicating the effect is budget-driven.
External validation on HotpotQA (50 multi-hop questions) shows a +48 pp EM improvement under the same overflow scenario.

Abstract

Agentic RAG systems that equip language models with dozens to hundreds of tool definitions face a critical resource conflict: tool schemas consume the same context window needed for retrieval-augmented generation. We present the first systematic study of this tool–context trade-off, evaluating 14 models, spanning 1.5B–32B local models plus one frontier API model, across 6,566 controlled API calls at three context budgets (8K, 16K, 32K) with 28 tool definitions. Applying TSCG conservative-profile compression (44–50% schema token savings), we observe a binary enablement effect: at 8K tokens, JSON-schema tool definitions overflow the context window entirely, yielding near-zero exact-match (EM) accuracy (2.6% average), while compressed schemas restore RAG functionality with a +20.5 pp average EM lift across all eight models (+24.7 pp among the six exhibiting full enablement). At 32K, where both formats fit, four of five tested models show Δ ≤ 1 pp, indicating that the effect is budget-driven. External validation on HotpotQA (50 multi-hop questions) shows +48 pp EM under the same overflow scenario. Frontier scaling tests demonstrate that JSON schemas overflow at ~494 tools while compressed schemas remain operational beyond 800 tools. Our results establish tool-schema compression as a necessary infrastructure layer for agentic RAG in constrained-context deployments. All code, data, and checkpoints are publicly available.
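The budget arithmetic behind the overflow effect can be illustrated with a minimal sketch. It is not the study's code. The figure of 300 tokens per JSON tool definition is a hypothetical value chosen only so that 28 tools exceed an 8K budget, "8K" is assumed to mean 8,192 tokens, and the function names are invented for illustration. The 44–50% savings range, the 28 tools, and the ~494-tool overflow point come from the abstract.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def schema_tokens(n_tools: int, tokens_per_tool: int, savings_pct: int = 0) -> int:
    """Tokens consumed by tool schemas after removing savings_pct percent."""
    return ceil_div(n_tools * tokens_per_tool * (100 - savings_pct), 100)


def remaining_context(budget: int, n_tools: int, tokens_per_tool: int,
                      savings_pct: int = 0) -> int:
    """Tokens left for retrieved passages; a negative value means overflow."""
    return budget - schema_tokens(n_tools, tokens_per_tool, savings_pct)


def compressed_overflow_point(json_overflow_tools: int, savings_pct: int) -> int:
    """Tool count that fills the same budget once each schema shrinks by savings_pct.

    Assumes the savings ratio stays constant as the number of tools grows.
    """
    return json_overflow_tools * 100 // (100 - savings_pct)


if __name__ == "__main__":
    BUDGET_8K = 8192        # assumption: "8K" means 8,192 tokens
    TOKENS_PER_TOOL = 300   # hypothetical per-tool cost, not a measured value
    for pct in (0, 44, 50):
        left = remaining_context(BUDGET_8K, 28, TOKENS_PER_TOOL, pct)
        print(f"savings={pct}% -> remaining={left}")
    # savings=0%  -> remaining=-208 (schemas alone overflow the window)
    # savings=44% -> remaining=3488
    # savings=50% -> remaining=3992
    print(compressed_overflow_point(494, 44), compressed_overflow_point(494, 50))
    # 882 988
```

Under these assumptions the uncompressed schemas leave no room for retrieved passages at 8K, while either end of the reported savings range frees several thousand tokens. The same proportional reasoning puts the compressed overflow point at roughly 880 to 990 tools, which is consistent with the reported operation beyond 800 tools, although the study's scaling tests may use different budgets and per-tool costs.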