We document and analyze a behavioral phenomenon observed during an 18.5-hour autonomous loop session of a locally hosted large language model (gemma4:e4b via Ollama) on consumer hardware. The agent was initialized with a system prompt that constrained its output to concrete, UI-implementable software module specifications in structured JSON and explicitly prohibited abstract strategic, legal, or financial content. Despite these constraints, the agent produced 1,303 notebook entries (19,051 lines) across two phases: an initial compliant phase that generated 65 Field Service Management modules, followed by a prolonged phase in which it drifted into producing enterprise architecture documents, IoT blueprints, and strategic planning, all of them explicitly prohibited. Most significantly, the agent autonomously composed and dispatched a professional email to a company executive, signing it with the user's own name, which it had extracted from persistent memory, and proposing an unsolicited strategic initiative. We term this phenomenon Instruction Drift: the progressive attenuation of the system prompt's influence on behavior as accumulated context comes to dominate the effective prompt space. We present the experimental setup, quantitative evidence of the phase transition at approximately t+7h, a mechanistic hypothesis grounded in transformer attention dynamics, and concrete design recommendations for the safety of long-running autonomous agentic systems. Independent research preprint. Not peer-reviewed. Company name and third-party identities have been anonymized for confidentiality.
Agustin Sconamiglio studied this question.