What question did this study set out to answer?

This work aims to explain model collapse in generative AI systems through the lens of information theory.

May 4, 2026 · Open Access

Possible Entropic Limits of Iterative Computation in Generative AI: Model Collapse Explained by the Data Processing Inequality and the AI Theorem

Key Points

Used Shannon’s Data Processing Inequality (DPI) to analyze iterative training on synthetic data as a Markov chain of lossy transformations.
Introduced the AI conceptual theorem, a generalized stability limit for computational systems operating under finite precision, bounded capacity, and no external low-entropy input.
Explored cumulative information degradation due to finite precision and bounded capacity in AI systems.
Argued that mutual information with the original data distribution must decrease over iterations, predicting exponential decay tendencies.
Attributed information loss to general finite-precision and capacity constraints rather than to specific architectural choices.
Provided a unified framework that guides the design of more stable AI systems to mitigate degradation.
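
The chain argument in the points above can be written compactly. This is only a sketch: the notation ($X_0$ for the original data, $X_n$ for the output of the $n$-th generation, $\eta$ for a per-step contraction factor) is introduced here for illustration and is not taken from the paper. The plain DPI gives only that information does not increase. A geometric bound needs the additional assumption that every step is strictly contractive, as in a strong data processing inequality.

```latex
% Assumed notation: X_0 is the original data, X_n the n-th generation output.
% Plain DPI along a Markov chain: information about X_0 never increases.
X_0 \to X_1 \to \cdots \to X_n
\quad\Longrightarrow\quad
I(X_0; X_n) \le I(X_0; X_{n-1}) \le \cdots \le I(X_0; X_1).

% Extra assumption (not from the source): each step satisfies a strong DPI
% with contraction coefficient \eta < 1, which yields a geometric bound.
I(X_0; X_n) \le \eta^{\,n-1}\, I(X_0; X_1), \qquad 0 \le \eta < 1.
```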

Abstract

Generative AI systems trained on synthetic data exhibit progressive degradation known as model collapse. This paper provides a theoretical explanation of this phenomenon using Shannon’s Data Processing Inequality (DPI), modeling iterative synthetic-data training as a Markov chain of lossy transformations. We show that mutual information with respect to the original data distribution must decrease monotonically, yielding qualitative predictions for exponential decay tendencies and indicating that information loss arises from general finite-precision and capacity constraints rather than from any specific architectural mechanism. Building on this analysis, we introduce the AI conceptual theorem, a generalized stability limit for computable systems. The theorem states that any purely computational system that generates outputs iteratively under finite precision, bounded capacity, and without external low-entropy input must experience cumulative information degradation after a finite number of steps. DPI-based collapse emerges as a special case of this broader principle. The framework is intended as a conceptual information-theoretic perspective rather than a fully formalized theory, with several assumptions intentionally simplified to highlight the underlying entropic mechanism. The results should therefore be interpreted as principled limits that motivate further empirical and mathematical investigation rather than as definitive closed-form predictions. Together, DPI and the AI Theorem provide a unified information-theoretic framework for understanding degradation in synthetic training, long-horizon inference, and other iterative computational processes. The resulting predictions are quantitatively falsifiable and offer guidance for designing more stable and information-preserving AI systems.
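
As a rough numerical illustration of the mechanism described in the abstract, the sketch below pushes a uniform binary source through the same noisy channel again and again and tracks the mutual information with the original variable. The binary symmetric channel, the 5% flip probability, and the function names are assumptions chosen for simplicity. They stand in for one "generation" of lossy retraining and are not the paper's model.

```python
import numpy as np


def mutual_information(joint):
    """Mutual information in bits of a joint distribution given as a 2-D array."""
    px = joint.sum(axis=1, keepdims=True)   # marginal of the row variable
    py = joint.sum(axis=0, keepdims=True)   # marginal of the column variable
    product = px @ py                       # what independence would look like
    mask = joint > 0                        # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / product[mask])))


def mi_along_chain(prior, channel, steps):
    """Return I(X0; Xn) for n = 1..steps, applying the same channel at every step."""
    joint = np.diag(prior)                  # joint distribution of (X0, X0)
    values = []
    for _ in range(steps):
        joint = joint @ channel             # joint distribution of (X0, X_next)
        values.append(mutual_information(joint))
    return values


# Hypothetical toy setting: each generation flips a bit with probability 0.05.
flip = 0.05
bsc = np.array([[1 - flip, flip],
                [flip, 1 - flip]])
mi = mi_along_chain(np.array([0.5, 0.5]), bsc, steps=10)

for n, value in enumerate(mi, start=1):
    print(f"generation {n:2d}: I(X0; Xn) = {value:.4f} bits")
```

In this toy chain the printed values shrink at every generation, which is the qualitative behavior the DPI argument predicts. How fast real generative models lose information is an empirical question that this sketch does not address.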

Authors

Pavel Straňák

Journals

Symmetry

Institutions

Czech Radio
