Anthropic’s April 2026 Claude Mythos Preview release established a new operational threat category: frontier AI systems whose extended-context reasoning, recursive self-correction, native system-tool integration, and agentic scaffolding render the dominant AI safety paradigms (RLHF, output filtering, contractual access vetting, and human-in-the-loop supervision) insufficient as sole controls. This paper develops a defense-in-depth reference architecture against that category, structured around four named contributions: a five-indicator operational definition of the Mythos-class (capability conjoined with scaffold, access pattern, autonomy depth, and persistence); the Mythos-Class Posture Rubric (MCPR), a three-tier detection framework spanning evaluation, deployment, and runtime with explicit routing to mitigation layers; a four-layer mitigation stack comprising the Vetted-Access Operational Pattern (VAOP), Authority-Bound Output Release (ABOR), which is cryptographically grounded in the FIPS 203/204/205 post-quantum primitives, and the Compute-Plane Isolation Profile (CPIP); and an integrated architecture that crosswalks to the NIST AI Risk Management Framework, NIST Cybersecurity Framework 2.0, and CISA Zero Trust Maturity Model 2.0. The architecture is applied to three deployment surfaces (post-quantum cryptography migration, federal AI supply-chain assurance, and critical-infrastructure operational technology defense) to demonstrate that the four contributions generalize across heterogeneous operational contexts. The contribution is a reference design rather than a deployed system; the paper also sets out its limitations, falsifiability criteria, and a research agenda for empirical refinement.
Robert Campbell
Prince George's County Public Schools
Computers
Robert Campbell (Fri,) studied this question.
synapsesocial.com/papers/6a12966a48a0ea166567339c — DOI: https://doi.org/10.3390/computers15060331