Large language models make knowledge-graph question answering (KGQA) fluent, but every query pays the full price of an LLM call. We present TACET (Latin: "it is silent"), a self-distilling neuro-symbolic cascade that amortises that cost by migrating a streamed workload off the LLM and onto cheap, checkable tiers, so that the expensive teacher progressively falls silent. The cascade has three tiers: a sound forward-chaining Datalog rule engine (Tier 1, which abstains when it cannot prove an answer), a confidence-calibrated ComplEx link predictor verified against a typed ontology (Tier 2), and an LLM teacher (Tier 3). The central mechanism is an online distillation loop: whenever the teacher answers, TACET mines its answers into Datalog-checkable Horn rules and writes the facts back, so the routing distribution drifts toward the cheap tiers as the workload streams. Crucially, a synthesised rule generalises to entities the teacher never saw, which an answer cache cannot do. On a controlled KGQA benchmark (8 seeds), the cascade answers at 98.1% accuracy while reducing blended cost by ~3.9x relative to an LLM-only system under a simulated per-tier cost model. We then confirm the saving on real data with a real teacher: streaming MetaQA (a 43k-entity movie KG) through Grok 4.3 over 3 seeds in a controlled design (Tier 2 disabled and a single teacher answer shared across arms, so accuracy is matched by construction), TACET amortises measured LLM dollars by 2.8x (1-hop) and 5.1x (2-hop) in pooled cost; the per-seed cost ratio spans 1.5–6.3x on 1-hop because only non-empty teacher answers are cached. This saving is delivered by answer reuse (the cascade's caching tier); under the real teacher, rule distillation adds no dollar advantage, since full distillation ties the cache on every seed.
The distillation-over-caching effect is itself teacher-quality-gated. With an oracle teacher, the miner recovers a generalising composition rule and makes 87% fewer teacher calls than a cache, because the rule answers unseen heads and a cache cannot. Under the noisy real LLM, the miner recovers no installable rule and the cascade reduces to a cache. Tier-1 answers carry replayable Datalog proof trees, and we prove an ontology-preservation guarantee for the synthesised rules. We release the implementation, the benchmark generator, and the full experiment grid at the linked repository.
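The contrast between rule distillation and answer caching can be made concrete with a small sketch. It is illustrative only and is not the released implementation: the toy facts, the relation name `director_birthplace`, the functions `rule_tier`, `oracle_teacher`, and `run`, and the shortcut of installing a fixed composition rule after the first teacher call (standing in for the rule miner) are all assumptions introduced here. Tier 2 is omitted, mirroring the controlled design described above.

```python
from typing import Optional

Fact = tuple[str, str, str]   # (head, relation, tail)
Rule = tuple[str, str, str]   # target(x, z) :- r1(x, y), r2(y, z)

# Toy knowledge graph (hypothetical entities, not MetaQA data).
FACTS: set[Fact] = {
    ("film_a", "directed_by", "dir_1"), ("dir_1", "born_in", "city_x"),
    ("film_b", "directed_by", "dir_2"), ("dir_2", "born_in", "city_y"),
}
COMPOSITION: Rule = ("director_birthplace", "directed_by", "born_in")


def rule_tier(rules: list[Rule], head: str, rel: str) -> Optional[set[str]]:
    """Tier-1 stand-in: apply composition rules; return None to abstain."""
    answers: set[str] = set()
    for target, r1, r2 in rules:
        if target != rel:
            continue
        mids = {t for h, r, t in FACTS if h == head and r == r1}
        answers |= {t for h, r, t in FACTS if h in mids and r == r2}
    return answers or None


def oracle_teacher(head: str, rel: str) -> set[str]:
    """Oracle stand-in for the LLM teacher: always returns the true answer."""
    return rule_tier([COMPOSITION], head, rel) or set()


def run(stream: list[tuple[str, str]], distil: bool) -> list[str]:
    """Route each query and record which tier answered it."""
    rules: list[Rule] = []
    cache: dict[tuple[str, str], set[str]] = {}
    route: list[str] = []
    for head, rel in stream:
        if rule_tier(rules, head, rel) is not None:
            route.append("rules")
        elif (head, rel) in cache:
            route.append("cache")
        else:
            answer = oracle_teacher(head, rel)
            if answer:                      # only non-empty answers are cached
                cache[(head, rel)] = answer
            if distil and COMPOSITION not in rules:
                rules.append(COMPOSITION)   # assumption: miner succeeds at once
            route.append("teacher")
    return route


STREAM = [
    ("film_a", "director_birthplace"),
    ("film_a", "director_birthplace"),   # repeated query
    ("film_b", "director_birthplace"),   # head the teacher never saw
]
cache_only = run(STREAM, distil=False)
with_rule = run(STREAM, distil=True)
print(cache_only)  # ['teacher', 'cache', 'teacher']
print(with_rule)   # ['teacher', 'rules', 'rules']
```

In this toy stream the cache saves only the repeated query, while the installed rule also answers the unseen head `film_b`. If the miner never installs a rule, as reported for the noisy real teacher, `run(..., distil=True)` would behave exactly like the cache-only arm.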
Quang Minh Nguyen (Fri,) studied this question.