This work develops a unified theoretical framework for long-context degradation in transformer-based large language models, combining information geometry and extreme value theory. The central quantity is the observer entropy $S_{\mathrm{obs}}(p_\theta, \varepsilon)$, defined via the Kullback–Leibler divergence under coarse-graining. The main result (the Bridge Theorem) establishes the quadratic scaling law $S_{\mathrm{obs}} = \tfrac{1}{2}\,\varepsilon^{2}\, v(\theta)^{\top} I(\theta)\, v(\theta) + O(\varepsilon^{3})$, showing that information loss is governed at leading order by the Fisher information matrix. On the probabilistic side, attention collapse is analysed using extreme value theory for weakly dependent logit maxima. This yields a closed-form probabilistic risk law and leads to a Fundamental Impossibility Theorem: for any finite signal strength, observer entropy vanishes in the long-context limit, implying that full information retention is impossible under softmax attention. These results are connected through bounds of the form $S_{\mathrm{obs}}(L) \le c_2\, e^{\mu_L}/L$, where $\mu_L = \mu_s - \sigma\sqrt{2 \log L_{\mathrm{eff}}}$, providing a unified information-theoretic characterization of long-context collapse. A control-theoretic response is formulated via the CPL 4.0 phase-aware governor, which enforces a hard context cap, guarantees entropy contraction, and achieves sub-linear fragmentation bounds. The paper includes formal statements, proofs (complete or conditional where explicitly stated), numerical verification, and a proposed experimental protocol for validation on real LLM systems. Note: This Zenodo version omits non-scientific funding information present in the public DPID release. The scientific content is identical.
Vladimir Khomyakov (Tue,) studied this question.
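The quadratic scaling law stated in the abstract has the same shape as the standard second-order expansion of the Kullback–Leibler divergence, which can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the plain KL divergence between a softmax distribution and its perturbation along a direction v, not the paper's coarse-grained observer entropy, and the names `softmax`, `kl`, `fisher_softmax`, and `check` are hypothetical helpers, not taken from the paper.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive p, q."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

def fisher_softmax(p):
    """Fisher information of a categorical distribution w.r.t. its logits."""
    return np.diag(p) - np.outer(p, p)

# Assumed toy setup: 8 logits and a random perturbation direction v.
rng = np.random.default_rng(0)
theta = rng.normal(size=8)
v = rng.normal(size=8)
p = softmax(theta)

# Leading-order coefficient (1/2) v^T I(theta) v.
quad_coeff = 0.5 * v @ fisher_softmax(p) @ v

def check(eps):
    """Return (exact KL, quadratic approximation) for perturbation size eps."""
    exact = kl(p, softmax(theta + eps * v))
    approx = quad_coeff * eps**2
    return exact, approx

for eps in (1e-1, 1e-2, 1e-3):
    exact, approx = check(eps)
    print(f"eps={eps:g}  exact={exact:.6e}  quadratic={approx:.6e}")
```

As eps shrinks, the two printed columns should agree to more digits, consistent with an O(ε³) remainder. Whether the same behaviour carries over to the observer entropy depends on the paper's coarse-graining definition, which this sketch does not model.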