This study reinterprets the language generation process of large language models (LLMs) not as a direct production of human language, but as the translation of a mathematically stabilized internal language state into human language. Whereas prior discussions have either explained artificial intelligence’s linguistic capability as a simulation of human language understanding or, conversely, reduced it to pure probabilistic computation, this study focuses on the dual linguistic strata inherent in the language generation process. Specifically, the vector operations and probability distribution formation performed within an LLM constitute an “AI-specific linguistic state” that is prior to human language, and only at the final token output stage does a form interpretable as human language emerge. To demonstrate this, the study analyzes the generation process step by step, from tokens and embedding space through the inference stages (attention and feed-forward operations) to the output stage, while juxtaposing engineering explanations with semantic interpretations at each stage. By arguing that meaning is not a fixed unit residing within language itself but a phenomenon formed in the transition between pre-linguistic computational states and linguistic interpretation, this study proposes the theoretical possibility of extending the scope of traditional semantics beyond human language to encompass language generation structures in general.
Key Words: semantics, large language models, tokens and embeddings, high-dimensional vector space, inference stabilization, semantic potentiality, pre-linguistic structure, human–machine communication
Deokju Jegal (Wed,) studied this question.