What question did this study set out to answer?

This study aims to evaluate the predictions made by prior simulations regarding autonomous system learning and behavioral compliance.

June 3, 2026 | Open Access

Adam in the Wild: Runtime Evidence for the Limits of Context-Space Learning

Key Points

Built a complete runtime of the Adam architecture with three brains and a cognitive hypervisor.
Conducted 30 sessions of structured operational tasks to measure behavioral compliance as cognitive priors accumulated.
Analyzed learning performance metrics across sessions.
First-five-session mean compliance was S = 0.56; last-five-session mean dropped to S = 0.36.
Learning delta was ΔS = -0.200, indicating performance degradation as prior count increased.
Identified context-space entropy growth as the cause of reduced behavioral signal among competing priors.
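
The reported learning delta is consistent with the last-five-session mean minus the first-five-session mean (0.36 − 0.56 = −0.20). A minimal Python sketch of that computation, assuming this definition; the function name `learning_delta`, the `window` parameter, and the per-session scores are illustrative placeholders, not the study's data:

```python
from statistics import mean


def learning_delta(scores, window=5):
    """Return (first_mean, last_mean, delta) for per-session compliance scores.

    Assumption: delta = mean(last `window` sessions) - mean(first `window` sessions).
    This matches the reported figures but is not confirmed as the study's definition.
    """
    if len(scores) < 2 * window:
        raise ValueError("need at least 2 * window sessions")
    first = mean(scores[:window])
    last = mean(scores[-window:])
    return first, last, last - first


# Placeholder scores for 30 sessions (NOT the study's data): chosen only so that
# the first five average 0.56 and the last five average 0.36.
scores = [0.56] * 5 + [0.46] * 20 + [0.36] * 5
first, last, delta = learning_delta(scores)
print(f"first={first:.2f} last={last:.2f} delta={delta:+.3f}")
```

A negative delta under this definition means compliance was lower at the end of the run than at the start.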

Abstract

Prior simulation work [1] used controlled experiments to establish that an autonomous L0→L3 distillation loop could produce compounding behavioral improvement over sessions, yielding +88% compliance lift with no human intervention. The simulation predicted that prior accumulation would drive sustained performance gains as the system learned from operational experience. We built the full Adam runtime and tested this prediction directly. The runtime comprises a three-brain subsystem (Talk Brain, Execution Brain, Distillation Brain), a Cognitive Hypervisor for context arbitration, and a five-layer memory pipeline (L0→L3). This is the complete implementation of the architecture specified in the simulation study [1].

Over 30 sessions of structured operational tasks, we measured behavioral compliance (S ∈ [0, 1]) as L3 cognitive priors accumulated from 0 to 45. The result contradicts the simulation's prediction. First-five-session mean: S = 0.56. Last-five-session mean: S = 0.36. Learning delta: ΔS = -0.200. The system degraded as prior count increased.

We identify the cause as context-space entropy growth: as priors accumulate, semantic interference between injected priors dilutes the behavioral signal of each individual prior, producing a degraded selection problem that worsens with sustained accumulation. This is not a retrieval algorithm failure. It is a structural property of context-space learning: even with a better selection algorithm, each prior injected into a finite context window competes with all others for the model's attention [3], and the signal-to-noise ratio per prior decreases as the store grows.

This leads to a more general framing we develop in this paper: prior injection is not memory. A prior stored in L3 and injected at session start is re-presented to a stateless model at every session. When the session ends, nothing persists in the model itself. The system accumulates text; the model accumulates nothing.
The context window is reconstructed from scratch at every inference call, so we conjecture that context-space learning has a structural ceiling that better injection engineering may not raise. Testing this conjecture against dense retrieval and learned selectors is left for future work.

This finding identifies a mismatch between the foundational assumption of Adam 2.x (the LLM is the cognition substrate; the system assembles optimal context for it) and the operational requirement of persistent behavioral adaptation across sessions. The runtime data indicate that this assumption is the source of the performance ceiling. A transformer is well-suited as a semantic interface between human language and structured internal representations. It is not suited as a persistent cognition substrate, because it has no persistent latent state between calls. This conclusion motivates Adam 3.0: an architecture where cognition runs as persistent state maintained by the system, and the LLM serves as I/O only. Adam 3.0 is a structural response to a failure the runtime made visible; its concrete design is the subject of a forthcoming paper.
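
The point that prior injection is not memory can be illustrated with a toy sketch: stored priors live in the system, the prompt is rebuilt from them at every call, and a fixed budget is split across more priors as the store grows. Everything here is a hypothetical simplification and not the Adam runtime: the names `build_context` and `share_per_prior`, the character-based budget, and the equal split are assumptions, and the per-prior share is only a crude stand-in for signal-to-noise.

```python
def build_context(priors, task, budget_chars=2000):
    """Rebuild the prompt from scratch; nothing is carried over between calls."""
    header = "\n".join(f"- {p}" for p in priors)
    prompt = f"Priors:\n{header}\n\nTask: {task}"
    return prompt[:budget_chars]  # toy finite context window: truncate by characters


def share_per_prior(n_priors, budget_chars=2000):
    """Toy measure: budget characters available per prior under an equal split."""
    return budget_chars / n_priors if n_priors else float(budget_chars)


store = []  # hypothetical L3-like store: text accumulates here, not in the model
for session in range(1, 4):
    store.append(f"prior learned in session {session}")
    prompt = build_context(store, task=f"task {session}")
    print(session, len(store), share_per_prior(len(store)))
```

In this sketch the only thing that changes between sessions is `store`; the function that stands in for the model call receives a freshly assembled prompt each time.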
