We introduce TSCG (Token-Context Semantic Grammar), a deterministic compiler for tool-schema compression in agentic AI systems. Unlike learned compression methods, TSCG provides formal guarantees (≥51% savings on well-formed schemas) with sub-millisecond execution and zero runtime dependencies. Through systematic empirical evaluation across 12+ language models (4B to frontier scale) and 20,000+ API calls, we establish that the JSON schema format itself, not model capacity, is the primary bottleneck for small-model tool calling. Format translation alone explains the majority of token-cost variance (R²=0.88 collapses to R²=0.03 when the JSON baseline is removed). Our main contributions:
1. A taxonomy of model-format interactions identifying four distinct classes: format-dominated, compression-friendly, operator-neutral, and combination-fragile.
2. A per-operator decomposition framework demonstrating that operator effects are empirical and per-model, not vendor-architectural. Within-family inversions (GPT-5.4 vs GPT-5.2) refute hardcoded vendor classifications.
3. An adaptive empirical detection methodology (180-call sweep, ~1) for production deployment.
4. External validation on the Berkeley Function Calling Leaderboard (BFCL) across Claude Sonnet 4, GPT-4o, and GPT-5.2: all three models show accuracy improvements (108-181% Accuracy Retention Rate) alongside 46-72% token savings.
We release a reference implementation as 4 npm packages (@tscg/core, @tscg/mcp-proxy, @tscg/tool-optimizer, @tscg/openclaw) with a comprehensive findings library.
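The two headline metrics, token savings and Accuracy Retention Rate (ARR), can be illustrated with a minimal sketch. The abstract does not define ARR, so the code assumes it is the ratio of compressed-schema accuracy to JSON-baseline accuracy, expressed as a percentage (values above 100% then mean accuracy improved). The function names and sample numbers are hypothetical and are not taken from any @tscg package or from the reported results.

```typescript
// Hypothetical helpers for illustration only; not part of the @tscg API.

/** Percentage of tokens saved relative to the JSON-schema baseline. */
function tokenSavingsPct(baselineTokens: number, compressedTokens: number): number {
  if (baselineTokens <= 0) {
    throw new RangeError("baselineTokens must be positive");
  }
  return (100 * (baselineTokens - compressedTokens)) / baselineTokens;
}

/**
 * Accuracy Retention Rate, ASSUMED here to be
 * compressed accuracy / baseline accuracy, as a percentage.
 */
function accuracyRetentionRate(baselineAccuracy: number, compressedAccuracy: number): number {
  if (baselineAccuracy <= 0) {
    throw new RangeError("baselineAccuracy must be positive");
  }
  return (100 * compressedAccuracy) / baselineAccuracy;
}

// Made-up inputs: a 1000-token JSON schema compressed to 490 tokens,
// and accuracy moving from 0.5 to 0.75 on some benchmark.
const savings = tokenSavingsPct(1000, 490);
const arr = accuracyRetentionRate(0.5, 0.75);
console.log(`savings=${savings}% ARR=${arr}%`);
```

Under this assumed definition, an ARR in the reported 108-181% range would correspond to the compressed format scoring higher than the JSON baseline on the same benchmark.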
Furkan Sakizli (Sun,) studied this question.