Status: NeurIPS 2026 submission under double-blind review. Author identity anonymized.

Self-improving AI agents lack runtime safeguards that prevent evaluation drift, fragile outcome acceptance, and unbounded parameter updates from compounding into catastrophic policy degradation. We study cognitive policy oscillation (strategy degradation caused by hallucinated feedback) and map an oscillation phase diagram for self-improving agents across 384 synthetic and 32 LLM conditions. A sharp instability boundary emerges at moderate step sizes (h ≈ 0.2), yielding a phase-aware deployment rule.

WhyLab is a conditional causal audit framework that activates only in the unstable regime. It has three components:
C1: Information-theoretic drift index
C2: Sensitivity filter combining E-values and partial R² bounds
C3: Lyapunov-bounded damping controller

Our contribution is boundary delineation: identifying when intervention is warranted, not universal improvement. In controlled unstable regimes, the audit reduces oscillation by 76%. On adversarial LLM tasks, fixed C2 reduces regressions by 44% on Gemini 2.0 Flash (p=0.014, Bonferroni-adjusted p=0.042). In the stable regime (SWE-bench Lite, 10,500 episodes), the audit remains inactive, as predicted. Docker evaluations on Gemini 2.0/2.5 Flash show zero observed C2-caused regressions.

Change log (v2 vs v1): abstract condensed to a boundary-delineation framing (with honest acknowledgement of null results); C2-targeted SWE-bench selective follow-up transparently reported (no net gain vs fixed C2); full Docker evaluation on Gemini 2.5 Flash added; phase-aware deployment rule formalized; references and deployment checklist expanded.
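As a reading aid for the statistics quoted in the abstract, here is a minimal Python sketch of two standard calculations: the Bonferroni adjustment and the E-value for a risk ratio (the VanderWeele-Ding formula). It is not the authors' code. The function names are hypothetical, and the choice of three comparisons is an assumption inferred only from the ratio of the reported adjusted to unadjusted p-values (0.042 / 0.014); the abstract does not state the number of comparisons.

```python
import math


def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Bonferroni-adjusted p-value: multiply by the number of comparisons, cap at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return min(1.0, p_value * n_comparisons)


def e_value(risk_ratio: float) -> float:
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)).

    Ratios below 1 are inverted first, following the usual convention.
    """
    if risk_ratio <= 0:
        raise ValueError("risk_ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1.0 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


# ASSUMPTION: three comparisons, inferred from 0.042 / 0.014 in the abstract.
adjusted_p = bonferroni_adjust(0.014, 3)
print(f"Bonferroni-adjusted p = {adjusted_p:.3f}")

# Illustrative risk ratio only; it is not a value reported in the paper.
print(f"E-value for RR = 2.0: {e_value(2.0):.3f}")
```

Under that assumption the adjusted value matches the reported 0.042. How the C2 filter combines E-values with partial R² bounds is not specified in the abstract, so it is not sketched here.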
Anonymous Author
American Foundation for the Blind
Anonymous Author (Sun,) studied this question.
www.synapsesocial.com/papers/69e71423cb99343efc98d8f9 — DOI: https://doi.org/10.5281/zenodo.19063714