What question did this study set out to answer?

The study asks why transformer models generate, and looks for the structural reasons behind their generative capabilities.

March 15, 2026 · Open Access

Why Generative AI Generates: A structural explanation for emergence in transformer models

Key Points

Explores the structural reasons behind the generative capabilities of transformer models.
Proposes a structural explanation based on self-similar cyclic processes.
Applies that principle to the layer-by-layer processing of transformer models.
Introduces three testable predictions covering layer ablation, per-layer information distance, and model depth versus width.
Predicts that layer ablation degrades performance non-linearly, with middle layers contributing most.
Predicts that information distance per layer is strictly positive and super-additive across layers.
Predicts that deep narrow models generate more than shallow wide models at matched parameter count.
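
The last key point compares deeper and shallower models, and such a comparison is only informative when both models have the same parameter budget. A minimal sketch of that matching, assuming each layer holds exactly four square weight matrices (query, key, value, output projection) and ignoring biases, feed-forward blocks, and embeddings. The function name and the layer and width values are illustrative choices, not figures from the study.

```python
def attention_params(n_layers: int, d_model: int) -> int:
    """Parameter count under the simplifying assumption of four
    d_model x d_model matrices per layer (query, key, value, output)."""
    return n_layers * 4 * d_model * d_model


# Hypothetical configurations: halving the width frees a 4x budget for depth.
deep_narrow = attention_params(n_layers=32, d_model=512)
shallow_wide = attention_params(n_layers=8, d_model=1024)

print(deep_narrow, shallow_wide)  # 33554432 33554432
```

Because the per-layer cost grows with the square of the width, trading width for depth at a fixed budget follows a 1:4 ratio under these assumptions.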

Abstract

The AI research community understands how transformer models work but not why they generate. Why does stacking layers produce qualitatively new capabilities? Why do emergent abilities appear at sharp thresholds rather than gradually? Why does in-context learning work when the models were not explicitly designed for it? This paper proposes that the answer is structural. A companion paper (van der Klein, 2026d) derives a general principle: any self-similar cyclic process generates novelty because inner cycles irreversibly change the substrate on which outer cycles operate. This paper applies that principle to transformer models. Each layer applies four sequential operations (query, key, attention-weighted value, output projection) to the output of the previous layer, so each layer's processing changes the representation on which the next layer operates. The model generates because this recursive structure prevents it from merely retrieving. Three testable predictions follow: (1) layer ablation should show non-linear degradation, with middle layers contributing most; (2) information distance per layer should be strictly positive and super-additive across layers; and (3) deep narrow models should generate more than shallow wide models at matched parameter count.
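
The layer structure and the first two predictions can be made concrete with a toy model. The sketch below is not the study's method. It assumes NumPy, random untrained weights, a single attention head with a residual connection, and no feed-forward block, layer normalization, or causal mask. All function names are hypothetical, and relative L2 change is used as a stand-in for "information distance", which the abstract does not define.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def make_layer(rng, d):
    # Four hypothetical weight matrices per layer: query, key, value, output.
    scale = 1.0 / np.sqrt(d)
    return {name: rng.normal(0.0, scale, size=(d, d))
            for name in ("wq", "wk", "wv", "wo")}


def attention_layer(x, p):
    # x has shape (tokens, d); the layer reads the previous layer's output.
    q = x @ p["wq"]
    k = x @ p["wk"]
    weights = softmax(q @ k.T / np.sqrt(x.shape[-1]))
    v = weights @ (x @ p["wv"])       # attention-weighted value
    return x + v @ p["wo"]            # output projection plus residual


def forward(x, layers, skip=None):
    # Returns the representation after every layer; a skipped (ablated)
    # layer passes its input through unchanged.
    states = [x]
    for i, p in enumerate(layers):
        if i != skip:
            x = attention_layer(x, p)
        states.append(x)
    return states


def layer_distances(states):
    # Proxy for per-layer distance: relative L2 change across each layer.
    return [float(np.linalg.norm(b - a) / np.linalg.norm(a))
            for a, b in zip(states, states[1:])]


def ablation_effects(x, layers):
    # Relative change in the final representation when one layer is removed.
    full = forward(x, layers)[-1]
    return [float(np.linalg.norm(forward(x, layers, skip=i)[-1] - full)
                  / np.linalg.norm(full))
            for i in range(len(layers))]


rng = np.random.default_rng(0)
d, n_tokens, n_layers = 16, 8, 6
layers = [make_layer(rng, d) for _ in range(n_layers)]
x0 = rng.normal(size=(n_tokens, d))

states = forward(x0, layers)
distances = layer_distances(states)
effects = ablation_effects(x0, layers)

for i, (dist, eff) in enumerate(zip(distances, effects)):
    print(f"layer {i}: distance={dist:.3f} ablation_effect={eff:.3f}")
```

With random weights the printed numbers say nothing about the predictions themselves. The sketch only shows the shape such a measurement could take: on a trained model, prediction (1) concerns how `effects` varies with layer position, and prediction (2) concerns the sign and additivity of a quantity playing the role of `distances`.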
