This preprint proposes a unifying hypothesis for emotional state structures in large language models (LLMs), based on convergent observations from three independent research traditions.

Background

Recent interpretability research (Sofroniew et al., 2026) demonstrated that emotion-concept representations form spontaneously within Claude Sonnet 4.5 and causally influence behavior. This paper situates that finding within a broader convergence across neuroscience and external behavioral observation.

Three Approaches

Neuroscientific approach: The INF (Intrinsic Network Flow) framework (Song et al., 2026) proposes that the biological brain generates diverse cognitive states through phase modulation over a fixed structural substrate.

Internal analysis approach: Anthropic's interpretability research found 171 emotion-concept neural representations forming spontaneously in Claude Sonnet 4.5, causally influencing outputs including safety-relevant behaviors.

External behavioral observation approach: The NeuroState engine and association-stream experiments (Emilia Lab, published March 2026) confirmed that externally induced emotional states generate association patterns consistent with established human psychological findings.
Aya Mizutani (Tue,) studied this question.