The AI-agent field is optimizing the wrong variable. Enormous effort goes into making agents more capable — more tools, more autonomy, longer memory — while the property that decides whether an agent is safe to put near real data, money, or customers goes largely unbuilt: governability. The evidence is stark: in a systematic survey of 30 deployed agents, documented across 45 fields each, third-party safety testing is documented for only 3, and 25 of the 30 disclose no internal safety-evaluation results at all. The controls that do exist in the literature are fragmented — risk taxonomies (OWASP, NIST, Microsoft), process frameworks (NIST AI RMF, Google SAIF), vendor recommendations (Anthropic, OpenAI), and isolated point designs (CaMeL, execution isolation, guardrail toolkits). To our knowledge, no single publicly described system combines structural containment, mechanical tool-layer enforcement, budget and kill-switch controls, tamper-evident audited memory, and an independent critic — and then publishes what happened when it was adversarially tested. This paper describes one that does. The Governed Agent Doctrine is five enforcement rules plus a governed memory substrate, implemented and running as a single-operator personal AI operating system. Each rule is enforced mechanically, outside the language model, at the tool and infrastructure layer — because a control the model can talk its way around is not a control. The paper reports the system's own adversarial audit, including the uncomfortable parts: load-bearing behaviours held at 8/8 under probing, but the memory system's forgetting initially scored 0.5 — it re-asserted a corrected fact under paraphrase, until the guarantee was moved into the read path. The argument: the integration itself, and the discipline of auditing it, is the contribution. Governability is not a feature you add later; it is a property you build in by construction.
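To make "enforced mechanically, outside the language model" concrete, here is a minimal sketch of a tool-layer gate with an allowlist, a spend budget, and a kill switch. It is an illustration under stated assumptions, not the implementation described in this paper: `ToolGate`, `Denied`, the per-call cost table, and the log format are all hypothetical names chosen for the example. The point it shows is that the model only ever proposes a call; ordinary code decides whether the call runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class Denied(Exception):
    """Raised when the gate refuses a proposed tool call."""


@dataclass
class ToolGate:
    # All names here are hypothetical; none are taken from the paper.
    tools: Dict[str, Callable[..., object]]   # allowlisted tool implementations
    cost: Dict[str, float]                    # assumed per-call cost of each tool
    budget: float                             # remaining spend for this session
    killed: bool = False                      # operator-controlled kill switch
    log: List[Tuple[str, str]] = field(default_factory=list)

    def kill(self) -> None:
        """Engage the kill switch; every later call is refused."""
        self.killed = True

    def call(self, name: str, **kwargs: object) -> object:
        """Run a tool only if every check passes; the model cannot bypass this."""
        if self.killed:
            self._deny(name, "kill switch engaged")
        if name not in self.tools or name not in self.cost:
            self._deny(name, "tool not on allowlist")
        if self.cost[name] > self.budget:
            self._deny(name, "budget exhausted")
        # Charge the budget and record the decision before executing the tool.
        self.budget -= self.cost[name]
        self.log.append((name, "allowed"))
        return self.tools[name](**kwargs)

    def _deny(self, name: str, reason: str) -> None:
        """Record the refusal, then raise so the caller cannot ignore it."""
        self.log.append((name, f"denied: {reason}"))
        raise Denied(reason)
```

In this sketch a denied call raises rather than returning a message, so nothing the model says in its prompt or output can turn a refusal into an execution. The log records both allowed and denied calls; a real system would also need the log itself to be tamper-evident, which this example does not attempt.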
Shakir Hashim studied this question.