Abstract

This paper proposes a Stage 3 theoretical framework for understanding large language models (LLMs) through geometry and mathematical physics. Starting from a vocabulary embedding matrix $E \in \mathbb{R}^{N \times d}$, the paper identifies an intrinsic token semantic space $\mathbb{R}^r$, where $r$ is the effective semantic rank of the embedding representation. Treating the token sequence index as a pseudo-time coordinate extends this initial semantic space to a temporal-semantic ambient space $\mathbb{R}^{r+1}$. Observed language is then treated as discrete token samples, or trajectories, approximated at first order by a language manifold $\mathcal{M} \subset \mathbb{R}^{r+1}$. A scalar semantic potential is introduced on the language manifold; its manifold gradient defines a tangent vector describing the local direction and rate of steepest semantic change. In this formulation, the $r$-dimensional semantic space is analogized to a spatial field, while the token sequence dimension is treated as a pseudo-time domain. A token sequence can therefore be expressed as an ordered point cloud or trajectory embedded in the ambient space $\mathbb{R}^{r+1}$. However, semantically meaningful language tends to concentrate in smaller regions of this ambient space, which can be approximated by continuous manifolds. The diffusion equation provides a natural first candidate for fitting continuous manifolds to discrete linguistic samples, while wave and transport equations capture semantic propagation, structure preservation, and directional movement under contextual constraints. Together, these equations form a PDE-based framework for modeling language dynamics on the language manifold. Training is interpreted as an inverse problem: estimating the language manifold, the scalar potential structure, and the coefficient fields of the governing PDE from human-generated language. Inference is interpreted as the forward problem: a prompt imposes boundary or initial conditions and selects a continuation trajectory on the learned manifold. The framework offers a path from statistical pattern recognition toward a predictive theory of language dynamics grounded in manifold geometry and PDEs.
Lijia Zhang (Wed,) studied this question.
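
The construction in the abstract can be illustrated with a minimal numerical sketch. It is not the paper's method: the abstract does not say how the effective semantic rank is computed or how the diffusion equation is discretized. The sketch assumes an SVD energy threshold for the rank, the raw token index as the pseudo-time coordinate, and an explicit finite-difference heat step along the sequence as one simple reading of diffusion-based smoothing. The function names (`effective_rank`, `semantic_trajectory`, `diffuse`), the parameters `energy` and `kappa`, and the toy embedding matrix are all hypothetical.

```python
import numpy as np


def effective_rank(E, energy=0.95):
    """Smallest r whose top-r singular values hold `energy` of the squared spectrum.

    Assumption: the energy threshold is illustrative, not the paper's definition.
    """
    s = np.linalg.svd(E - E.mean(axis=0), compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy)) + 1
    return min(r, len(s))


def semantic_trajectory(E, token_ids, r):
    """Map a token sequence to ordered points in R^(r+1).

    Column 0 is the pseudo-time (token index); columns 1..r are semantic coordinates.
    """
    centered = E - E.mean(axis=0)
    # Rows of Vt are the dominant directions of the centered embedding rows.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    semantic = centered[token_ids] @ Vt[:r].T              # shape (T, r)
    t = np.arange(len(token_ids), dtype=float)[:, None]    # shape (T, 1)
    return np.hstack([t, semantic])                        # shape (T, r + 1)


def diffuse(points, kappa=0.2, steps=10):
    """Explicit heat-equation steps along the sequence axis, endpoints held fixed.

    Only the semantic coordinates are smoothed; the pseudo-time column is untouched.
    The explicit scheme is stable for kappa <= 0.5.
    """
    x = points.copy()
    for _ in range(steps):
        # Discrete second difference (1-D Laplacian) at interior points.
        lap = x[:-2, 1:] - 2.0 * x[1:-1, 1:] + x[2:, 1:]
        x[1:-1, 1:] += kappa * lap
    return x


rng = np.random.default_rng(0)
# Hypothetical toy vocabulary: N = 50 tokens, d = 16, rank-3 structure plus small noise.
E = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 16)) + 1e-3 * rng.normal(size=(50, 16))

r = effective_rank(E)
traj = semantic_trajectory(E, [4, 17, 8, 42, 23], r)
smooth = diffuse(traj)
print(r, traj.shape, smooth.shape)
```

Holding the endpoints fixed mirrors the abstract's view of a prompt as a boundary or initial condition, but this is a design choice of the sketch; other boundary treatments and discretizations are equally plausible.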