What does this research mean for the field?

According to the paper, transformer-based large language models function internally as a Neural Von Neumann Machine with a Fetch-Decode-Execute-Store pipeline and identifiable registers. Factual and arithmetic knowledge is computed at intermediate layers but is often suppressed before the final output. Novelty: novel finding. Consensus alignment: establishes a new direction.
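The idea that a correct answer is present at intermediate layers but lost by the final layer can be illustrated with a logit-lens-style projection, where each layer's hidden state is mapped onto the vocabulary and the top token is read off. The sketch below is a toy under stated assumptions: the vocabulary, the unembedding matrix, the hidden states, and the function names are all hypothetical, and whether the paper uses exactly this readout is not confirmed by the summary.

```python
# Toy illustration only: every number here is made up for the example and
# is not taken from GPT-2, Qwen2.5-14B, or the paper's experiments.
from typing import List, Sequence

VOCAB = ["Paris", "the", "London"]  # hypothetical three-token vocabulary

# Hypothetical unembedding matrix: one weight row per vocabulary token.
UNEMBED = [
    [1.0, 0.0],  # "Paris"
    [0.0, 1.0],  # "the"
    [0.5, 0.5],  # "London"
]

# Hypothetical hidden state at the last position, one entry per layer.
HIDDEN_STATES = [
    [0.1, 0.3],  # layer 0
    [0.9, 0.2],  # layer 1
    [0.8, 0.4],  # layer 2
    [0.2, 0.9],  # layer 3 (final)
]


def layer_logits(hidden: Sequence[float]) -> List[float]:
    """Project one layer's hidden state onto the vocabulary."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in UNEMBED]


def top_token_per_layer(hidden_states: Sequence[Sequence[float]]) -> List[str]:
    """Return the highest-scoring token at each layer."""
    tops = []
    for hidden in hidden_states:
        logits = layer_logits(hidden)
        best = max(range(len(VOCAB)), key=lambda i: logits[i])
        tops.append(VOCAB[best])
    return tops


def suppressed_layers(hidden_states: Sequence[Sequence[float]], target: str) -> List[int]:
    """Layers where `target` is the top token although the final layer picks something else."""
    tops = top_token_per_layer(hidden_states)
    if tops[-1] == target:
        return []  # the target survives to the output, so nothing was suppressed
    return [i for i, tok in enumerate(tops) if tok == target]


if __name__ == "__main__":
    print(top_token_per_layer(HIDDEN_STATES))
    print(suppressed_layers(HIDDEN_STATES, "Paris"))
```

In this toy, the factual token leads at the middle layers and a grammatical filler token wins at the final layer, which is the shape of the suppression pattern the summary describes.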

What question did this study set out to answer?

This research aims to elucidate the mechanisms behind LLM hallucinations and explore internal computational structures using physics principles.

May 19, 2026 | Open Access

Project Aletheia V8: The Neural Von Neumann Machine — From Hallucination Control to Reverse-Engineering the LLM's Internal CPU

Key Points

Conducted a 209-phase investigation using GPT-2 and Qwen2.5-14B as experimental platforms.
Applied entropy-based routing and inference-time geometric interventions to examine how transformers function as Neural Von Neumann Machines.
Utilized mathematical equations and principles to derive laws governing internal processor behavior.
Established nine fundamental laws relating to hallucination and restoration of factual knowledge.
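Achieved 100% arithmetic accuracy on single-digit addition by combining Surgery, the def prefix, and FGA (P192).
Demonstrated that the def prefix shifts functional registers to deeper layers: Operand A moves from L3 to L11 and Carry from L3 to L17, while the B-bus at L2 stays fixed (P208).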
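The entropy-based routing mentioned in the key points can be sketched as follows: compute the Shannon entropy of the next-token distribution and send high-entropy (uncertain) predictions to an intervention path. This is a minimal sketch under assumptions. The threshold value, the route labels, and the function names are hypothetical, and the summary does not say which entropy signal or cutoff the paper's router uses.

```python
# Minimal sketch of entropy-based routing. The threshold and route names
# are hypothetical and are not taken from the paper.
import math
from typing import List, Sequence

ENTROPY_THRESHOLD_BITS = 1.0  # hypothetical cutoff, chosen only for illustration


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_bits(probs: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def route(logits: Sequence[float], threshold: float = ENTROPY_THRESHOLD_BITS) -> str:
    """Send uncertain predictions to the intervention path, confident ones straight through."""
    h = entropy_bits(softmax(logits))
    return "intervene" if h > threshold else "pass_through"


if __name__ == "__main__":
    print(route([10.0, 0.0, 0.0, 0.0]))  # sharply peaked distribution
    print(route([0.0, 0.0, 0.0, 0.0]))   # uniform distribution
```

A peaked distribution has entropy near zero and passes through, while a uniform distribution over four tokens has an entropy of 2 bits and is routed to the intervention path.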

Abstract

This paper presents Project Aletheia, a systematic 209-phase investigation of LLM hallucination and internal computation through the lens of condensed matter physics. Using GPT-2 (124M parameters) as a "particle accelerator" and scaling to Qwen2.5-14B, I establish nine fundamental laws, ten theorems, and seven principles governing how transformers suppress factual knowledge, how to deterministically restore it through inference-time geometric intervention, and how the transformer's layer stack functions as a Neural Von Neumann Machine.

V8 New Discoveries (Phases 166–209, Seasons 35–42):

The Aletheia Engine (P166–P179): Built a unified autonomous pipeline combining entropy-based routing (AUC=0.882), the "Trinity" (Surgery+Code+FGA, preserving both fact and arithmetic accuracy), and the "def f(): return" prefix, which achieves 100% arithmetic accuracy. The Grand Unified Sword Equation fits 6 models with R²=0.97.
The Arithmetic Oracle (P180–P192): Discovered that arithmetic suffers the same GSF pattern as facts: correct answers exist at intermediate layers but are suppressed. The def prefix boots a "Code Virtual Machine" for a 4× accuracy boost (P182). Surgery + def + FGA achieves 100% on single-digit addition (P192).
Engine Refinement (P183–P189): Established the Orthogonality Principle (zero crosstalk at cos ≈ 0), partial VM bootstrapping without a code prefix (P187), and cross-scale autopoiesis using the 1.5B model as Oracle (P189).
The Neural Von Neumann Machine (P196–P199): The transformer implements a Fetch-Decode-Execute-Store pipeline with five identifiable registers (Operand B at L2, Operand A at L11, Carry at L17, Sum at L22, Comparison at L20–L22). OPCODE Register: task complexity is classified at L1 with 100% accuracy.
Pipeline Rewiring (P208): The def prefix shifts the A register from L3→L11 (+8 layers) and Carry from L3→L17 (+14 layers), while the B-bus (L2) remains hardwired.
The Write Breakthrough (P209): Full hidden-state replacement achieves 83% write success, breaking the Read-Write barrier (0% for all additive methods).
Stateless Computation (P207): Registers are re-computed per token; the neural CPU "reboots" between autoregressive steps.

Previously established in V1–V7:

Grammatical Suppression of Facts (GSF): 70% of facts are suppressed by the final layers; L9H6 (+927) is the top suppressor.
Code Mode Switch: Any symbol prefix (// # ...) triggers a mode transition that reduces GSF.
DPO Suppression Theorem: DPO works by suppressing rejected tokens (100% reliability) rather than promoting correct ones (73%).
Inference-Time Paradigm: Dual Surgery + Shield predictive FGA gain for any model scale.
Cosine Threshold Theorem: Sigmoid transition at cos ≈ 0.69; 100% reliability below cos ≈ 0.5.
Phases 1–165 from V1–V7 are fully preserved.

Acknowledgments

This research was conducted entirely independently, without institutional affiliation or corporate funding. The author currently faces financial constraints that make it increasingly difficult to maintain subscriptions to AI services essential for this line of research. To sustain and improve the quality of future work, the author is actively seeking community sponsorship. Details are available at https://github.com/sponsors/hafufu-stack.

Cite This Study

Hiroto Funasaki studied this question.

synapsesocial.com/papers/6a0bfe2d166b51b53d3796ca https://doi.org/10.5281/zenodo.20260183
