Transformer architectures dominate modern artificial intelligence systems but exhibit persistent limitations when operating over long contexts. As sequence length increases, models lose track of long-range dependencies, exhibit token drift, and develop unstable internal representations. This paper provides a structural interpretation of these limitations using the admissibility framework of the Paton System. Rather than treating context limits purely as engineering constraints, the analysis interprets Transformer context scaling as an admissibility-bounded recursion problem. Each model state is treated as a compressed node representation propagated through an admissible structural frame defined by architectural constraints, and the system may continue to a next state only if that state satisfies admissibility within the frame. When context length grows beyond the compression capacity of the node representation, the admissibility conditions fail, and the breakdown manifests as token drift, loss of long-range dependency tracking, and degraded attention coherence. This interpretation connects the observed scaling limits of Transformer models to a general structural principle: recursive computational systems maintain continuity only while state compression remains admissible within the governing frame constraints. The result is a domain-neutral structural explanation for context limits in modern AI systems, one that situates Transformer scaling behaviour within the broader framework of the Paton System.
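The admissibility-bounded recursion described above can be illustrated with a toy sketch. This is not the formal definition used by the Paton System, which the abstract does not state. It assumes, purely for illustration, that the node representation has a fixed capacity in bits and that each additional token adds a constant information cost. The names `Frame`, `capacity_bits`, `bits_per_token`, `is_admissible`, and `admissible_horizon` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical structural frame: a fixed budget for the compressed state."""
    capacity_bits: float    # assumed compression capacity of the node representation
    bits_per_token: float   # assumed (constant) information each token adds


def is_admissible(context_length: int, frame: Frame) -> bool:
    """Assumed admissibility test: the context must fit within the capacity."""
    return context_length * frame.bits_per_token <= frame.capacity_bits


def admissible_horizon(frame: Frame, max_steps: int) -> int:
    """Extend the context one token at a time while the next state is admissible.

    Returns the longest context length reached before admissibility fails
    or before max_steps is hit, whichever comes first.
    """
    length = 0
    while length < max_steps and is_admissible(length + 1, frame):
        length += 1
    return length


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the paper.
    frame = Frame(capacity_bits=4096.0, bits_per_token=2.0)
    print(admissible_horizon(frame, max_steps=10_000))
```

Under this linear-cost assumption, continuation stops at the point where the required information first exceeds the frame's capacity. A more realistic cost model would replace the body of `is_admissible` without changing the recursion itself.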
Andrew John Paton studied this question.