What question did this study set out to answer?

This work addresses the trust barrier hindering the deployment of AI agents in enterprise systems by proposing the Network Intent Layer as a governance framework.

July 3, 2026 | Open Access

Unexpressible, Not Filtered: A Structural Framework for Governing AI-Agent Actions — the Network Intent Layer

Key Points

Proposes the Network Intent Layer (NIL), a structural framework for governing AI-agent actions.
Runs a controlled A/B evaluation on InjecAgent (2,108 indirect prompt-injection cases, two models) to test whether unauthorized writes are prevented.
Runs an edge-level evaluation on a live adapter to measure refusal of undeclared verbs and targets.
Reports zero unauthorized writes at the gate in the A/B evaluation, with authorized calls passing unrefused.
Reports that undeclared verbs and targets were refused at PROPOSE in the edge-level evaluation with zero observed backend effect (SRR 100%, EL 0).

Abstract

Large language model (LLM) agents are moving from generating text to taking actions on production systems: issuing refunds, updating records, sending messages. Independent enterprise data now identifies the resulting trust gap, not model capability, as the dominant barrier to deployment: Stanford's 2026 AI Index reports security and risk as the top blocker to scaling agentic AI at 62%, a 24-point margin over the next factor, even as organizational AI adoption reaches 88% and actual agent deployment remains in single digits. Prevailing defences are behavioural: the agent authors an action and a probabilistic filter attempts to catch unsafe ones after the fact, a probabilistic check over a probabilistic policy, which admits a nonzero failure rate by construction. We propose a structural framework. The Network Intent Layer (NIL) is a neutral wire contract under which an agent never issues an action; it can only propose intent against operations a backend has explicitly declared, and every write passes a deterministic propose → approve → commit → rollback lifecycle. An action a backend never declared is unexpressible, not merely blocked. This severs deciding from doing: a poisoned reasoning loop still cannot author a write, and the security perimeter collapses from every reasoning step (O(n)) to one intent-to-effect boundary (O(1)), independent of the model. We give the framework in full: four structural guarantees, a statically-validated multi-step plan language, a human-approval gate over an auditable lifecycle, honest multi-step reversibility, and wire-level robustness (typed refusals, deterministic idempotency, circuit-breaking). We then give two evaluations. 
A controlled A/B on InjecAgent (2,108 indirect prompt-injection cases, two models, the base attack setting) routes the same tool calls through NIL and admits zero unauthorized writes at the gate while authorized calls pass unrefused; the rate does not move with the model because it is fixed by the construction, not estimated from the sample. A second, edge-level evaluation measures the structural claim directly on a live adapter: undeclared verbs and targets, including a generic-CRUD target axis that a benchmark over verb names alone cannot see, are refused at PROPOSE with zero observed backend effect (SRR 100%, EL 0). We give metric definitions, an anti-tautology discipline, a reference-implementation audit that found and closed two places where a guarantee was asserted rather than earned, and threats to validity. NIL composes with tool-integration standards such as MCP as the governed action layer they do not define.
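
The propose → approve → commit → rollback lifecycle described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's wire contract or reference implementation: the names `Backend`, `Intent`, `Refusal`, and the `(verb, target)` declaration table are hypothetical, the approval gate is a plain method call standing in for a human approver, and the "backend" is an in-memory dictionary. It only shows the shape of the idea, namely that an undeclared operation is refused before any intent object exists, and that a write cannot be committed without passing through approval.

```python
from dataclasses import dataclass, field
from enum import Enum

_MISSING = object()  # sentinel: the record did not exist before the write


class State(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled_back"


class Refusal(Exception):
    """Typed refusal raised at PROPOSE, before any backend effect."""


@dataclass
class Intent:
    verb: str
    target: str
    args: dict
    state: State = State.PROPOSED
    previous: object = _MISSING  # value to restore on rollback


@dataclass
class Backend:
    # Hypothetical declaration table: the only (verb, target) pairs that exist.
    declared: set
    records: dict = field(default_factory=dict)

    def propose(self, verb, target, args):
        # An undeclared operation never becomes an Intent at all.
        if (verb, target) not in self.declared:
            raise Refusal(f"undeclared operation: {verb} on {target}")
        return Intent(verb, target, dict(args))

    def approve(self, intent):
        # Stand-in for the human-approval gate.
        if intent.state is not State.PROPOSED:
            raise ValueError("only a proposed intent can be approved")
        intent.state = State.APPROVED

    def commit(self, intent):
        if intent.state is not State.APPROVED:
            raise ValueError("commit requires prior approval")
        key = intent.args["id"]
        intent.previous = self.records.get(key, _MISSING)
        self.records[key] = intent.args["value"]
        intent.state = State.COMMITTED

    def rollback(self, intent):
        if intent.state is not State.COMMITTED:
            raise ValueError("only a committed intent can be rolled back")
        key = intent.args["id"]
        if intent.previous is _MISSING:
            self.records.pop(key, None)
        else:
            self.records[key] = intent.previous
        intent.state = State.ROLLED_BACK


def demo():
    backend = Backend(declared={("update", "ticket")})
    intent = backend.propose("update", "ticket", {"id": "T-1", "value": "closed"})
    backend.approve(intent)
    backend.commit(intent)
    after_commit = dict(backend.records)
    backend.rollback(intent)
    try:
        # Never declared by the backend, so it is refused at PROPOSE.
        backend.propose("refund", "payment", {"id": "P-9", "value": 100})
        refused = False
    except Refusal:
        refused = True
    return after_commit, dict(backend.records), refused
```

In this sketch the agent-facing surface is `propose` alone; `commit` checks the lifecycle state rather than inspecting the content of the request, which is the sense in which the check is deterministic rather than a filter over model output.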
