We develop a mathematical research programme for language models organized around the Constraint–Projection–Limit (CPL) principle: useful behaviour in a high-dimensional stochastic system is selected when training, architecture, data, and finite precision force observables through low-entropy structural projections, after which concentration or stability renders the residual fluctuations predictable. The rigorous core consists of three selection statements. First, an observable-entropy collapse law shows that a Lipschitz function class determinizes when its metric entropy is subcritical relative to the ambient concentration rate. Second, a no-free-semantics theorem shows that robust separation of two positive-mass semantic classes on a normal Lévy family is impossible for uniformly Lipschitz logits unless Lipschitz scale, depth, class rarity, or the concentration model itself changes. Third, an exact projection theorem shows that the excess next-token log loss incurred by replacing the full context X with a structural code T(X) is precisely the conditional mutual information I(Y;X ∣ T), so the projection step of CPL becomes an operational quantity rather than a metaphor. The remaining results are organized into three tiers. Tier I contains full theorems: concentration foundations and depth lower bounds, path-wise martingale variance, chain-rule and de Finetti diagnostics, Pinsker control of independence error, curvature comparison, G-graded spectra, detailed balance, Lipschitz Johnson–Lindenstrauss factorization, and cyclic position-encoding decomposition. Tier II gives conditional architectural propositions on biclustering, Kanerva collision bounds, softmax free energy, and non-reversible attention. Tier III states conjectures: transformer Lyapunov stability, feedback-channel bit budgets, factor-model rank ceilings, average-opinion bias, Fisher–Rao concentration, and the CPL universality target. We also propose seven empirical tests, each with explicit kill criteria.
The contribution is to isolate quantities that can be proved, measured, or falsified: Lipschitz scale, metric entropy, conditional mutual information, entropy production, Lyapunov spectra, effective rank, and path-wise variance.
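The exact projection statement can be illustrated numerically on a small discrete example. The sketch below is not taken from the paper: the joint table `p_xy`, the code `T(x) = x // 2`, and all function names are hypothetical choices made for illustration. It assumes that the predictor is Bayes-optimal in both cases, so that the expected log loss from the full context is H(Y ∣ X) and the expected log loss from the code is H(Y ∣ T), and it checks that their difference matches I(Y;X ∣ T) computed independently from the definition.

```python
import numpy as np

def conditional_entropy(p_xy):
    """H(Y | X) in nats for a joint table p_xy[x, y]."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y_given_x = np.divide(p_xy, p_x, out=np.zeros_like(p_xy), where=p_x > 0)
    mask = p_xy > 0
    return float(-(p_xy[mask] * np.log(p_y_given_x[mask])).sum())

def coarse_joint(p_xy, code):
    """Joint table p_ty[t, y] induced by the deterministic code t = code[x]."""
    n_t = int(code.max()) + 1
    p_ty = np.zeros((n_t, p_xy.shape[1]))
    np.add.at(p_ty, code, p_xy)  # accumulate rows of p_xy that share a code value
    return p_ty

def conditional_mutual_information(p_xy, code):
    """I(Y; X | T) in nats, computed directly from its definition."""
    total = 0.0
    for t in np.unique(code):
        block = p_xy[code == t]              # rows x with T(x) = t
        p_t = block.sum()
        if p_t == 0:
            continue
        joint = block / p_t                  # p(x, y | t)
        p_x = joint.sum(axis=1, keepdims=True)
        p_y = joint.sum(axis=0, keepdims=True)
        mask = joint > 0
        ratio = joint[mask] / (p_x @ p_y)[mask]
        total += p_t * float((joint[mask] * np.log(ratio)).sum())
    return total

# Hypothetical toy setup: 6 contexts, 3 next-token values, code T(x) = x // 2.
rng = np.random.default_rng(0)
p_xy = rng.random((6, 3))
p_xy /= p_xy.sum()
code = np.arange(6) // 2

# Excess Bayes-optimal log loss from predicting with T(X) instead of X.
excess_loss = conditional_entropy(coarse_joint(p_xy, code)) - conditional_entropy(p_xy)
cmi = conditional_mutual_information(p_xy, code)
print(f"excess loss = {excess_loss:.6f} nats, I(Y;X|T) = {cmi:.6f} nats")
```

The two printed values should agree to floating-point precision for any deterministic code, because T being a function of X gives H(Y ∣ X) = H(Y ∣ X, T). An injective code loses nothing, and a constant code reduces the excess to the full mutual information I(X;Y).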
Miquel Noguer Alonso (Fri,) studied this question.