Agentic AI systems connect probabilistic reasoning to tools, memory, external data, other agents, and state-changing operations. Their dominant failure mode, prompt injection, is an integrity problem: low-integrity input contaminating high-authority action. Contract-Bound Cognitive Routing (CBCR) treats it as one, modelling agentic execution as information flow over a typed, capability-gated graph mediated by a deterministic reference monitor, the MCP Policy Firewall. The architecture is organised around a single distinction: what can be enforced deterministically, and what cannot. A large class of agentic behaviour can be constrained by construction (the control flow of a plan derived from trusted instructions, and structured data whose value-space is closed and validated) with no reliance on the model's judgement. The boundary is precise, and it is the line most defences blur: schema validation checks shape, not meaning, so type is not trust. Beyond that line (free text, semantically-loaded fields, data-derived parameters, untrusted endpoints) lies a residual that cannot be made deterministic. CBCR's contribution is to confine that residual to a single declared, fail-closed endorsement gate, make it measurable as false-labelling and false-endorsement rates weighted by reachable authority, and extend the discipline across multi-agent delegation through a monotonic non-amplification rule. It adopts the dual-path construction of Willison's dual-LLM pattern (2023) and CaMeL (Debenedetti et al., 2025); it does not solve conservative label propagation through a black-box model, which it states as the load-bearing open problem. CBCR does not make untrusted content safe. It prevents untrusted content from reaching high-authority sinks except through a declared gate, and turns the risk left behind into a measured quantity.

Version: v0.7
Language: English
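The enforcement idea described above can be illustrated with a minimal sketch. It is not the CBCR implementation: the names `Integrity`, `Labeled`, `check_sink`, `endorse`, and `delegate` are hypothetical, the three-level label lattice is an assumption, and reading "monotonic non-amplification" as "a delegate receives at most the intersection of what it requests and what its parent holds" is one plausible interpretation rather than the paper's stated rule.

```python
from dataclasses import dataclass
from enum import IntEnum


class Integrity(IntEnum):
    """Hypothetical integrity labels, ordered from least to most trusted."""
    UNTRUSTED = 0  # free text, data-derived parameters, untrusted endpoints
    ENDORSED = 1   # passed the single declared endorsement gate
    TRUSTED = 2    # derived from trusted instructions or closed, validated values


@dataclass(frozen=True)
class Labeled:
    """A value carried together with its integrity label."""
    value: object
    integrity: Integrity


class PolicyViolation(Exception):
    """Raised when the monitor refuses a flow."""


def check_sink(arg: Labeled, required: Integrity) -> None:
    """Deterministic check: block any argument below the sink's required label."""
    if arg.integrity < required:
        raise PolicyViolation(
            f"{arg.integrity.name} data cannot reach a sink requiring {required.name}"
        )


def endorse(arg: Labeled, approved: bool) -> Labeled:
    """Declared gate: the only place a label may be raised. Fails closed."""
    if not approved:
        raise PolicyViolation("endorsement refused")
    return Labeled(arg.value, max(arg.integrity, Integrity.ENDORSED))


def delegate(parent_caps: frozenset, requested: frozenset) -> frozenset:
    """Assumed non-amplification rule: a child never holds more than its parent."""
    return parent_caps & requested
```

In this sketch the label check and the delegation rule are plain set and ordering comparisons, so they involve no model judgement. The only non-deterministic input is the `approved` decision passed to `endorse`, which is where the residual risk described in the abstract would be concentrated and counted.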
Rob Brennan (Mon,) studied this question.