Current evaluation practices for advanced artificial agents emphasize performance metrics such as reward, accuracy, or task completion. These metrics often fail to detect structural instabilities that arise under long-horizon operation, especially when optimization signals are sparse, misleading, or adversarial. In this work, we introduce viability horizons: a quantitative framework for detecting impending system collapse in agents subject to irreversible selection pressure. Building on a history-level perspective, we show that collapse is not necessarily preceded by performance degradation; it is preceded instead by a loss of internal coherence across time, memory, and control channels. We formalize collapse as a failure to sustain a coherent system history under irreversible updates and demonstrate that alignment, reward maximization, and capability scaling are neither necessary nor sufficient conditions for long-term viability. We propose operational metrics for coherence drift, delayed failure, and irreversibility-induced brittleness, and outline concrete experimental protocols for detecting collapse regimes in contemporary agentic systems. These results reframe AI risk as a structural stability problem rather than a behavioral or normative one.

Keywords: artificial intelligence stability, long-horizon coherence, irreversible updates, system collapse, alignment failure modes, cognitive persistence, information loss, entropy accumulation, dynamical systems, AI safety theory
Jonah Brent
Jonah Brent (Thu,) studied this question.
www.synapsesocial.com/papers/6974616cbb9d90c67120b3c6 — DOI: https://doi.org/10.5281/zenodo.18333931