This white paper formulates statistical learning as a variational principle on the space of probability measures over the parameters of a model. The state of a learning system is an ensemble ρ; training is the least‑action descent of a free‑energy functional F = U − Tσ, equivalently the Wasserstein gradient flow of F (the Fokker–Planck/Langevin evolution). Its equilibrium is the Gibbs measure, and its dissipation is one with the structural non‑invertibility of the learning map. The excess risk above the Bayes floor decomposes exactly into three kinematic coordinates: expressivity, induction, and search. The document introduces no new structure and proves no new theorems. It reorganises established results (the Gibbs variational principle, Jordan–Kinderlehrer–Otto gradient flows, energy–dissipation, PAC–Bayes, information geometry) around a single least‑action statement in the condensed style of Landau's Mechanics, and establishes its compatibility with the author's framework Coordinates of Statistical Learning (in preparation). The method follows one explicit commitment, the peratic principle: a theory should inscribe its own boundary rather than feign closure. The principle is stated at the outset and invoked wherever the framework marks an edge. The contribution is organisational: the unification of the kinematic chart with the dynamic variational principle, and the correspondence of irreversibility with non‑invertibility.
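For orientation, the objects named above can be written out in their standard textbook form. This is a sketch under stated assumptions, not the paper's own notation: the potential V(θ) (for instance an empirical loss), the parameter θ, the Brownian motion W_t, and the reading of U and σ as an expected energy and a differential entropy are assumptions introduced here for illustration.

```latex
% Assumed notation: V(\theta) is a potential on parameter space, T > 0 a temperature.
\begin{align*}
  F[\rho] &= U[\rho] - T\,\sigma[\rho],
  \qquad U[\rho] = \int V(\theta)\,\rho(\theta)\,d\theta,
  \qquad \sigma[\rho] = -\int \rho(\theta)\log\rho(\theta)\,d\theta, \\[4pt]
  % Wasserstein gradient flow of F, i.e. the Fokker--Planck equation
  \partial_t \rho &= \nabla\!\cdot\!\Big(\rho\,\nabla\frac{\delta F}{\delta\rho}\Big)
                   = \nabla\!\cdot\!\big(\rho\,\nabla V\big) + T\,\Delta\rho, \\[4pt]
  % Particle-level (Langevin) description of the same evolution
  d\theta_t &= -\nabla V(\theta_t)\,dt + \sqrt{2T}\,dW_t, \\[4pt]
  % Equilibrium: the Gibbs measure
  \rho^{*}(\theta) &= \frac{e^{-V(\theta)/T}}{Z},
  \qquad Z = \int e^{-V(\theta)/T}\,d\theta, \\[4pt]
  % Energy--dissipation: F is non-increasing along the flow
  \frac{d}{dt}F[\rho_t] &= -\int \rho_t\,\Big|\nabla\frac{\delta F}{\delta\rho}\Big|^{2} d\theta \;\le\; 0,
  \qquad F[\rho] - F[\rho^{*}] = T\,\mathrm{KL}\big(\rho\,\|\,\rho^{*}\big).
\end{align*}
```

Under these assumptions the last line reads as follows: the free energy decreases monotonically along the flow, and its gap to the Gibbs minimum equals the temperature times the relative entropy to equilibrium.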
HORACIO BRIZUELA (Tue,) studied this question.