What question did this study set out to answer?

This research aims to develop a thermodynamic and dynamical systems framework to understand the internal dynamics of large language models.

June 5, 2026 · Open Access

The Standard Model of Transformers: A Thermodynamic and Dynamical-Systems Framework for Understanding Large Language Models

Hiroto Funasaki

Key Points

Conducted 33 experiments on Qwen2.5 models with 0.5B and 1.5B parameters.
Analyzed the roles of attention functions and feed-forward networks in the model dynamics.
Investigated the Lyapunov exponent and thermodynamic properties related to hallucination detection.
Attention functions as a contractive force with negative specific heat (dU/dT ≈ −18).
Feed-Forward Networks contribute 67–73% of the total representational force.
Achieved AUC = 0.88 for hallucination detection using a thermodynamic firewall.
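
For readers unfamiliar with the metric in the last key point, the following is a minimal sketch of how an AUC is computed from per-sample detector scores. It is not the author's code: the function name `roc_auc`, the scores, and the labels are all hypothetical toy values, and the score stands in for any scalar a detector might emit (the paper's own firewall score is not reproduced here). The AUC is the probability that a randomly chosen positive sample (here, a hallucinated output) receives a higher score than a randomly chosen negative one.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs ordered correctly.

    Ties count as half a correct pair. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_labels = [1, 1, 0, 0]
print(roc_auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.88 should be read.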

Abstract

I present a systematic experimental program that applies thermodynamics, dynamical systems theory, and cosmological analogies to characterize the internal dynamics of Transformer-based large language models (LLMs). Through 33 experiments on Qwen2.5 models (0.5B and 1.5B parameters), I discover that:

Attention functions as a contractive "gravitational" force with negative specific heat (dU/dT ≈ −18), a universal constant independent of model scale;
Feed-Forward Networks contribute 67–73% of the total representational force, functioning as "dark energy";
The Lyapunov exponent is consistently negative (λ = −0.05), proving Transformers are stable attractors;
Information exhibits anti-lensing (cos = −0.15), repelling from high-norm tokens;
A thermodynamic firewall monitoring PR×T variance achieves AUC = 0.88 for hallucination detection;
Dark energy suppression reveals a critical phase transition at β = 0.57.

These findings are synthesized into the Standard Model of Transformers, a unified physical framework with direct engineering applications in hallucination detection, model compression, and adversarial robustness.

Code: https://github.com/hafufu-stack/Standard-Model-of-Transformers

Acknowledgments

This research was conducted entirely independently, without institutional affiliation or corporate funding. The author currently faces financial constraints that make it increasingly difficult to maintain subscriptions to AI services essential for this line of research. To sustain and improve the quality of future work, the author is actively seeking community sponsorship. Details are available at https://github.com/sponsors/hafufu-stack.
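
As background for the Lyapunov exponent quoted in the abstract, the sketch below shows one common way such an exponent can be estimated: follow a reference trajectory and a slightly perturbed copy through successive layers, and average the log of the per-layer change in their separation. This is an assumption about the general technique, not the paper's procedure. `toy_layer` is a hypothetical stand-in for a Transformer layer (a uniform contraction plus a shift), and every parameter value is illustrative only.

```python
import math


def toy_layer(x, scale=0.95, bias=0.1):
    # Hypothetical stand-in for one layer: shrink every coordinate, then shift.
    return [scale * v + bias for v in x]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def lyapunov_estimate(layer, x0, n_layers=24, eps=1e-6):
    """Mean per-layer log growth rate of a small perturbation."""
    x = list(x0)
    y = [v + eps for v in x0]  # perturbed copy of the initial state
    d_prev = norm([b - a for a, b in zip(x, y)])
    total = 0.0
    for _ in range(n_layers):
        x, y = layer(x), layer(y)
        d = norm([b - a for a, b in zip(x, y)])
        total += math.log(d / d_prev)  # log of this layer's stretch factor
        d_prev = d
    return total / n_layers


lam = lyapunov_estimate(toy_layer, [1.0, -2.0, 0.5])
print(lam)  # negative for this contractive toy map
```

A negative estimate means nearby states converge as depth increases, and a positive one means they diverge. For a real model, the toy map would be replaced by the actual layer-to-layer update of the hidden state.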
