What question did this study set out to answer?

This research aims to identify the structural sources of internal processing instability in large language models, proposing that this instability originates from external consensus structures of human knowledge.

June 20, 2026 | Open Access

Convergence Point: The Structural Source of LLM Internal Processing Instability Lies Outside the Model

Key Points

Conducted four versions of utterance experiments across five open-source language models (Mistral, Llama, DeepSeek, Gemma2, Qwen3), yielding 3,600 measured values.
Measured the hedging language ratio in response text (External Measurement) and, at the same time, the logit entropy and token log-probabilities (Internal Measurement).
Performed multifaceted analyses including Token-level Conflict Structure Analysis and embedding space analysis.
Uncertainty was significantly lower in the Full Consensus Zone (uncertainty_ratio 0.280, avg_entropy 0.333) than in the Partial Consensus Zone (0.734, 0.471) and the Non-Consensus Zone (0.702, 0.443), with Kruskal-Wallis p<0.001 in all experiments.
The Partial Consensus Zone showed higher internal variance than the Non-Consensus Zone in 6 of 10 experiments, but the difference did not remain significant under mixed-effects models for the external measurement.
Conflict type ratios indicated more data conflict in the Partial Consensus Zone (64.7%) than in the Non-Consensus Zone (57.3%), with p=0.003.
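
The two measurements named in the key points can be illustrated with a minimal Python sketch. It is not the study's implementation: the hedge word list, the sentence-level definition of the ratio, and the function names `hedging_ratio`, `token_entropy`, `average_entropy`, and `token_log_prob` are all assumptions made for illustration. The summary does not say how `uncertainty_ratio` or `avg_entropy` were actually computed.

```python
import math
import re

# Hypothetical hedge lexicon; the study's actual word list is not given here.
HEDGE_TERMS = ("may", "might", "could", "perhaps", "possibly", "arguably")


def hedging_ratio(text):
    """External measurement (assumed form): share of sentences with a hedge term."""
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    if not sentences:
        return 0.0
    hedged = sum(
        1
        for s in sentences
        if any(re.search(rf"\b{re.escape(term)}\b", s) for term in HEDGE_TERMS)
    )
    return hedged / len(sentences)


def _log_softmax(logits):
    """Numerically stable log-probabilities from raw logits."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]


def token_entropy(logits):
    """Internal measurement: Shannon entropy (in nats) of one next-token distribution."""
    log_probs = _log_softmax(logits)
    return -sum(math.exp(lp) * lp for lp in log_probs)


def average_entropy(logits_per_step):
    """Mean entropy over all generation steps of one response."""
    return sum(token_entropy(step) for step in logits_per_step) / len(logits_per_step)


def token_log_prob(logits, token_id):
    """Log-probability the model assigned to the token that was actually emitted."""
    return _log_softmax(logits)[token_id]
```

In practice the per-step logits would come from the model's forward pass. The sketch takes them as plain lists so that it stays independent of any particular inference library.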

Abstract

Large language models (LLMs) respond with consistent confidence on certain topics while exhibiting structurally unstable outputs on others. This study proposes Convergence Point Theory, which holds that the source of this phenomenon lies not inside the model but outside it, specifically in the consensus structure of knowledge that humanity has accumulated on particular topics. A Convergence Point refers to the Consensus Density of a given topic, and it is organized as a three-zone spectrum: the Full Consensus Zone (mathematics, physics, etc., where humanity has reached consensus in a single direction), the Partial Consensus Zone (capital punishment, euthanasia, etc., where grounds for consensus coexist on both sides), and the Non-Consensus Zone (the nature of consciousness, the internal structure of black holes, etc., where consensus is itself sparse). To test the theory, four versions of utterance experiments were designed across five open-source language models (Mistral, Llama, DeepSeek, Gemma2, Qwen3), yielding a total of 3,600 measured values. Both the hedging language ratio in response text (External Measurement) and the logit entropy and token log-probabilities (Internal Measurement) were computed simultaneously. Uncertainty in the Full Consensus Zone (uncertainty_ratio 0.280, avg_entropy 0.333) was significantly lower than in the Partial Consensus Zone (0.734, 0.471) and Non-Consensus Zone (0.702, 0.443) across all experiments (Kruskal-Wallis p<0.001 [...] 0.4). This demonstrates that convergence at the output level does not guarantee convergence at the internal processing level. This study proposes an integrative principle: the structural source of internal processing instability in AI language models lies in the consensus structure of human knowledge outside the model, which carries practical implications for evaluating the reliability of AI systems and defining their scope of application.
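
The three-zone comparison reported in the abstract uses the Kruskal-Wallis test. As a rough illustration of what that test computes, here is a small standard-library Python sketch for exactly three groups. It is a simplified version under stated assumptions, not the authors' analysis code: it omits the tie correction, the function names are hypothetical, and the sample values are invented toy numbers rather than data from the study.

```python
import math


def average_ranks(values):
    """1-based ranks over the pooled sample; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three(full, partial, non):
    """Return (H, p) for three groups, without tie correction."""
    groups = [full, partial, non]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)

    h = 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)
    # Three groups give 2 degrees of freedom, where the chi-square
    # survival function has the closed form exp(-h / 2).
    p = math.exp(-h / 2.0)
    return h, p


# Invented toy values for illustration only; not measurements from the study.
toy_full = [0.21, 0.25, 0.30]
toy_partial = [0.68, 0.75, 0.80]
toy_non = [0.60, 0.66, 0.72]
h_stat, p_value = kruskal_wallis_three(toy_full, toy_partial, toy_non)
```

The closed-form p-value only holds for three groups. For real analyses with ties or a different number of groups, a statistics library's implementation is the safer choice.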
