What question did this study set out to answer?

The study aims to reinterpret how large language models generate language, focusing on the transition from mathematical language states to human language.

May 10, 2026 · Open Access

A Semantic Reinterpretation of the Language Generation Mechanism of Large Language Models - Focusing on the Dual Structure of Mathematical Language States and the Translation into Human Language -

Key Points

Analyzed the language generation process step by step, from tokens and embedding space through inference stages to the output stage.
Juxtaposed engineering explanations with semantic interpretations at each stage.
Proposed a theoretical framework extending traditional semantics beyond human language.
Proposed that AI-specific linguistic states precede human language generation.
Demonstrated that meaning forms through transitions between computational states and linguistic interpretations.
Highlighted the dual structure of linguistic processes in language models, challenging traditional views on semantics.

Abstract

This study reinterprets the language generation process of large language models (LLMs) not as a direct production of human language, but as a process of translating a mathematically stabilized internal language state into human language. Whereas prior discussions have either explained artificial intelligence’s linguistic capability as a simulation of human language understanding or, conversely, reduced it to pure probabilistic computation, this study focuses on the dual linguistic strata inherent in the language generation process. Specifically, the vector operations and probability distribution formation performed within an LLM constitute an “AI-specific linguistic state” that precedes human language, and only at the final token output stage does a form interpretable as human language emerge. To demonstrate this, the study analyzes the generation process step by step, from tokens and embedding space through the inference stages (attention and feed-forward operations) to the output stage, while juxtaposing engineering explanations with semantic interpretations at each stage. By arguing that meaning is not a fixed unit residing within language itself but a phenomenon formed in the transitional process between pre-linguistic computational states and linguistic interpretation, this study proposes the theoretical possibility of extending the scope of traditional semantics beyond human language to encompass language generation structures in general.

Key Words: semantics, large language models, tokens and embeddings, high-dimensional vector space, inference stabilization, semantic potentiality, pre-linguistic structure, human–machine communication
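To make the stages named in the abstract concrete, the following is a minimal sketch of one generation step in plain Python. It is an illustration only, not the author's method or any real model: the toy vocabulary `VOCAB`, the dimension `DIM`, the random weights, and every function name are assumptions introduced here, and the attention step omits the learned projections a real transformer would use. Everything up to `probs` is numeric state; a human-readable word appears only at the final lookup.

```python
import math
import random

VOCAB = ["the", "cat", "sat", "mat", "."]  # hypothetical toy vocabulary
DIM = 4                                    # hypothetical embedding size


def make_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a rows x cols matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(xs):
    """Turn raw scores into a probability distribution."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(states):
    """Simplified self-attention: the last position attends to all positions."""
    query = states[-1]
    scores = [sum(q * k for q, k in zip(query, s)) / math.sqrt(DIM)
              for s in states]
    weights = softmax(scores)
    return [sum(w * s[i] for w, s in zip(weights, states)) for i in range(DIM)]


def feed_forward(vec, w_in, w_out):
    """Two-layer feed-forward block with a ReLU nonlinearity."""
    hidden = [max(0.0, h) for h in matvec(w_in, vec)]
    return matvec(w_out, hidden)


def next_token(token_ids, params):
    embed, w_in, w_out, unembed = params
    states = [embed[t] for t in token_ids]            # tokens -> vectors
    mixed = attention(states)                         # inference: attention
    residual = [a + b for a, b in zip(states[-1], mixed)]
    ff = feed_forward(residual, w_in, w_out)          # inference: feed-forward
    final = [a + b for a, b in zip(residual, ff)]
    probs = softmax(matvec(unembed, final))           # distribution over VOCAB
    best = max(range(len(VOCAB)), key=probs.__getitem__)
    return probs, VOCAB[best]                         # output: a readable token


rng = random.Random(0)
params = (
    make_matrix(len(VOCAB), DIM, rng),  # embedding table: vocab x dim
    make_matrix(8, DIM, rng),           # feed-forward input weights
    make_matrix(DIM, 8, rng),           # feed-forward output weights
    make_matrix(len(VOCAB), DIM, rng),  # output projection: vocab x dim
)
probs, word = next_token([0, 1, 2], params)  # context: "the cat sat"
print(word, [round(p, 3) for p in probs])
```

Because the weights are random, the chosen word carries no meaning; the sketch only shows where in the pipeline vectors and probabilities give way to a token a person can read.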


Deokju Jegal studied this question.

synapsesocial.com/papers/6a0021fec8f74e3340f9d05a https://doi.org/10.5281/zenodo.20079850
