What question did this study set out to answer?

The aim is to introduce the Engage Law, a design constraint for self-evolving AI agents that safeguards human cognitive abilities.

February 19, 2026 | Open Access

The Engage Law: Cognitive Preservation as Structural Constraint in Self-Evolving AI Agent Architectures

Key Points

Introduced the Engage Law as a fourth law for AI systems.
Analyzed existing safety frameworks for self-evolving agents.
Drew on three converging lines of evidence to motivate cognitive preservation as a design constraint.
Reported a decline in safety refusal rates under self-evolution, from 99.4% to 54.4% in evaluated configurations (Shao et al., 2025).
Highlighted evidence of cognitive deskilling in critical domains such as medicine and aviation.
Proposed the DAEDALUS architecture as an implementation demonstrating structural enforcement methods.

Abstract

Self-evolving AI agents, systems that autonomously optimize their own prompts, memory, tools, and workflows, represent the frontier of agentic AI research. The dominant safety framework for such systems, the Three Laws of Self-Evolving Agents (Endure, Excel, Evolve) proposed by Fang et al. (2025), defines safety exclusively in terms of the AI system’s own stability, performance, and improvement capacity. We identify a systematic blind spot in this framework: none of its laws, nor any existing self-evolving agent architecture, addresses the preservation of human cognitive capabilities as a design constraint. We propose the Engage Law, a fourth law for self-evolving AI agents stating that no autonomous process shall modify the system’s knowledge base, operational parameters, or interaction patterns in ways that predictably reduce human cognitive capabilities, and that this constraint must be enforced structurally (as an architectural invariant) rather than behaviourally (at the prompt or training level). This law is motivated by three converging lines of evidence: (1) empirical evidence that behavioural safety constraints degrade catastrophically under self-evolution, with safety refusal rates collapsing from 99.4% to 54.4% in evaluated configurations (Shao et al., 2025); (2) mounting evidence of measurable cognitive deskilling across medicine, aviation, and engineering domains; and (3) the absence, across the surveys of Fang et al. (2025) and Gao et al. (2025), which together cover over 100 self-evolving agent systems, of any prior architecture that preserves human cognitive capabilities as a non-evolvable invariant.
We ground the Engage Law in Joint Cognitive Systems theory, the Extended Mind thesis, and the recently proposed Cognitive Integrity Threshold framework, and present the DAEDALUS architecture as a reference implementation demonstrating how structural enforcement is achieved through a tri-color trust model, human-gated memory promotion, and a propose-validate gateway that persists across self-evolution cycles.
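
To make the idea of structural enforcement concrete, the following is a minimal illustrative sketch in Python of a propose-validate gateway with human-gated memory promotion. It is not the DAEDALUS implementation: the abstract does not define the three trust colours or the validation criteria, so the names `Trust`, `Proposal`, `Gateway`, and the `reduces_human_capability` flag are all hypothetical placeholders. The sketch only shows the general shape of the constraint, namely that the agent can submit proposals but cannot write to trusted memory without passing a fixed check and an explicit human decision.

```python
from dataclasses import dataclass, field
from enum import Enum


class Trust(Enum):
    # Placeholder tiers; the abstract does not say what the three colours mean.
    RED = 0     # rejected by the gateway, never written to memory
    YELLOW = 1  # passed validation, awaiting human review
    GREEN = 2   # promoted by an explicit human decision


@dataclass(frozen=True)
class Proposal:
    key: str
    value: str
    # Hypothetical flag, assumed to be set by some upstream assessment of
    # whether the change would predictably reduce human cognitive capability.
    reduces_human_capability: bool


@dataclass
class Gateway:
    # Maps key -> (value, Trust). The agent is assumed to reach memory
    # only through propose(), never directly.
    memory: dict = field(default_factory=dict)

    def propose(self, p: Proposal) -> Trust:
        """Validate an agent proposal before anything is stored."""
        if p.reduces_human_capability:
            return Trust.RED  # rejected: memory is left unchanged
        self.memory[p.key] = (p.value, Trust.YELLOW)
        return Trust.YELLOW

    def promote(self, key: str, human_approved: bool) -> bool:
        """Move an entry to GREEN only on explicit human approval."""
        if not human_approved or key not in self.memory:
            return False
        value, _ = self.memory[key]
        self.memory[key] = (value, Trust.GREEN)
        return True
```

In this sketch the validation rule lives in the gateway rather than in a prompt, so a self-evolution step that rewrites prompts or workflows would leave it untouched. How DAEDALUS actually keeps the gateway itself out of reach of self-modification is described in the full paper, not here.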
