This work develops a unified geometric framework for understanding recurring regularities observed across independently trained large language models. Drawing on information geometry, the paper formalizes a statistical substrate: a manifold‑like structure induced by the statistical organization of the natural language distribution. The substrate is defined as a smooth manifold equipped with the Fisher information metric and an associated connection, providing a model‑agnostic geometric object through which different language models can be interpreted as approximate coordinate systems. As stated in the paper, “There exists a smooth Riemannian manifold… whose structure is determined by the statistical organization of the natural language distribution”. The framework synthesizes several empirical phenomena that have been independently reported in the literature, including representational similarity, recurring computational motifs, approximate scaling regularities, and partially transferable stylistic or semantic structure. These observations are interpreted as qualitatively consistent with models converging toward a shared underlying geometric object determined by the language distribution. As the introduction notes, “similarities across models would reflect a shared approximation target rather than coincidence alone”. The substrate formalism is developed by treating the natural language distribution as inducing a statistical manifold whose geometry reflects the distinguishability structure of linguistic configurations. The Fisher metric provides a principled notion of distance, curvature encodes structural constraints, and the associated connection governs how representations evolve under training dynamics. Within this framework, independently trained models correspond to different approximate embeddings of the same manifold, related by alignment maps that become more accurate with scale and training quality.
The paper examines geometric and dynamical evidence compatible with this view, including neural‑collapse‑like configurations, cross‑model representational alignment, sparse‑feature superposition, and structured attention patterns. It then formulates a set of falsifiable predictions concerning cross‑model alignment, intrinsic dimensionality, style attractors, and model mergeability, and outlines experimental protocols for testing them. The work is intended as a foundation for future empirical investigation and theoretical refinement, offering a single geometric vocabulary in which diverse empirical findings can be compared.
M. B. Eames studied this question.