Large language models often remain coherent in short interactions while exhibiting instability over longer conversational horizons. Most existing evaluation approaches are turn-local and retrospective, and therefore fail to anticipate such failures before they manifest. This work introduces an output-only diagnostic framework for detecting and predicting multi-turn inference instability without access to model internals, training data, or semantic ground truth. Instability is formalized via observable structural events in interaction transcripts and evaluated as a prediction task over conversation prefixes. Across multiple models and long-horizon tasks, the proposed diagnostics anticipate coherence collapse several turns in advance and outperform turn-local heuristic baselines. The framework is intentionally diagnostic: it does not aim to interpret, correct, or control model behavior, but to provide a reproducible and model-agnostic mechanism for identifying conditions under which previously stable reasoning trajectories become unreliable. This record represents a canonical archived version intended for citation and long-term reference.

Keywords: output-only diagnostics; long-horizon reasoning; inference instability; language model evaluation; multi-turn interaction; reasoning stability; black-box analysis; structural diagnostics
Marko Andreas Ernst Chalupa studied this question.
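
The abstract describes instability as observable structural events in a transcript, evaluated as a prediction task over conversation prefixes. The following is a minimal sketch of what such a setup could look like, not the framework's actual diagnostics. Everything in it is an assumption made for illustration: the event definition (a turn that nearly repeats the previous one), the prefix score (mean overlap between recent consecutive turns), and the names `jaccard`, `is_event`, `prefix_score`, and `prefix_examples`, along with the `threshold`, `window`, and `horizon` parameters.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two turns (0.0 when both are empty)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def is_event(turns: List[str], i: int, threshold: float = 0.8) -> bool:
    """Hypothetical structural event: turn i nearly repeats turn i-1."""
    return i > 0 and jaccard(turns[i], turns[i - 1]) >= threshold


def prefix_score(prefix: List[str], window: int = 3) -> float:
    """Hypothetical diagnostic: mean overlap of the last `window` consecutive-turn pairs."""
    pairs = [(prefix[i - 1], prefix[i]) for i in range(1, len(prefix))][-window:]
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def prefix_examples(turns: List[str], horizon: int = 2) -> List[Tuple[int, float, bool]]:
    """Build (prefix length, score, label) rows from model outputs only.

    The score sees only turns[:k]; the label records whether an event
    occurs in any of the next `horizon` turns after the prefix.
    """
    rows = []
    for k in range(2, len(turns)):
        upcoming = range(k, min(k + horizon, len(turns)))
        label = any(is_event(turns, i) for i in upcoming)
        rows.append((k, prefix_score(turns[:k]), label))
    return rows
```

Each row pairs a score computed from a prefix alone with a label taken from the turns that follow, so a ranking metric such as AUROC could then be used to compare a prefix-level score against a turn-local baseline. The sketch relies only on transcript text, in keeping with the output-only setting described above.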