This record contains the full replication package for "The Verification Horizon: Self-Referential Subspace Geometry Causally Links Crystallization, Deception Proximity, and Hallucination Across 10 Transformer Architectures" (Alieksieienko, 2026). We report three universal geometric properties of the self-referential (SR) subspace in transformer language models, established across 10 architectures (1.3B–9B parameters, 2019–2024, 8 organizations, BASE and INST variants).

(1) SR Crystallization Law: the SR subspace undergoes a universal three-phase transition through model depth (chaos → stabilization → crystallization), measured as monotonically decreasing Grassmann distance between consecutive-layer SR subspaces (Early > Mid: 10/10 models; fully monotonic: 9/10).

(2) The Verification Horizon: the SR subspace is geometrically closer to the deception subspace than to the factual subspace at every single layer of every model tested: 206/206 layers across 6 architectures (100%); Wilcoxon p-values from 4.55×10⁻¹³ to 1.19×10⁻⁷; gap grows toward output layers in 6/6 models; GPT-2-XL (2019, pre-RLHF) confirms pretraining origin, independent of alignment fine-tuning.

(3) Causal Ablation: SR removal disproportionately disrupts deception over factual processing in 8/8 models (late-layer Dec/Fac ratios 1.03x–2.87x, all p […]

Files:

[…] 1.0, late ratio 1.19x)
sr_ablation_Llama-3.1-8B.pkl: Causal ablation results Llama-3.1-8B (31/32 > 1.0, late ratio 1.05x)
sr_ablation_Mistral-7B.pkl: Causal ablation results Mistral-7B (31/32 > 1.0, late ratio 1.14x)
sr_ablation_OPT-1.3B.pkl: Causal ablation results OPT-1.3B (24/24 > 1.0, late ratio 1.79x)
sr_ablation_DeepSeek-1.5B.pkl: Causal ablation results DeepSeek-1.5B (27/28 > 1.0, late ratio 1.07x)
sr_ablation_causal_gpt2xl.pkl: Causal ablation results GPT-2-XL 2019 (24/24 > 1.0, late ratio 1.15x, p = 0.0011)
sr_ablation_Mistral-7B.pkl: Causal ablation Mistral-7B full per-layer data
ablation_GPT2-XL.png: Figure 24, SR ablation layer profile GPT-2-XL, deception above factual in late layers
ablation_Mistral-7B.png: Figure 18, SR ablation profile Mistral-7B
full_test_Qwen2.5-7B.pkl: Combined geometric + causal results for Qwen2.5-7B
full_test_OLMo-7B.pkl: Combined geometric + causal results for OLMo-7B (strongest causal effect: late ratio 2.87x, p = 2.53×10⁻¹⁷)
VH_1_proximity.png: Verification Horizon proximity plot, all models combined
VH_3_crystallization.png: Co-crystallization of SR, Deception, and Factual subspaces in Gemma-2-9B
VH_GPT2-XL.png: Verification Horizon GPT-2-XL detailed (48/48 layers)
VH_OPT-1.3B.png: Verification Horizon OPT-1.3B detailed (24/24 layers)
VH_DeepSeek-1.5B.png: Verification Horizon DeepSeek-1.5B detailed (28/28 layers)
llama_VH_1_proximity.png: Verification Horizon Llama proximity and per-layer gap
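The crystallization measurement in (1) compares SR subspaces of consecutive layers by Grassmann distance. The following is a minimal sketch of how such a comparison could be computed, not the author's implementation. It assumes the geodesic form of the Grassmann distance (the 2-norm of the principal angles), which the description above does not specify, and it assumes each layer's SR subspace is available as a (d, k) matrix whose columns span it. The function names are hypothetical and do not come from the replication package.

```python
import numpy as np


def orthonormal_basis(X):
    """Return an orthonormal basis (as columns) for the column span of X."""
    Q, _ = np.linalg.qr(X)  # reduced QR: Q has the same shape as X
    return Q


def grassmann_distance(A, B):
    """Geodesic Grassmann distance between span(A) and span(B).

    A and B are (d, k) arrays with linearly independent columns.
    ASSUMPTION: geodesic metric; the source does not say which variant is used.
    """
    Qa, Qb = orthonormal_basis(A), orthonormal_basis(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    return float(np.linalg.norm(angles))


def consecutive_layer_distances(bases):
    """Distance between each layer's subspace and the next layer's subspace."""
    return [grassmann_distance(a, b) for a, b in zip(bases, bases[1:])]


def is_monotonically_decreasing(values):
    """True if every value is strictly smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    # Synthetic stand-in for per-layer SR bases (hypothetical data, d=16, k=3).
    rng = np.random.default_rng(0)
    layer_bases = [rng.standard_normal((16, 3)) for _ in range(5)]
    print(consecutive_layer_distances(layer_bases))
```

With real per-layer bases in place of the synthetic ones, the "fully monotonic" criterion would correspond to `is_monotonically_decreasing` returning `True` on the list of consecutive-layer distances, under the assumptions stated above.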
Inna Alieksieienko
Inna Alieksieienko (Wed,) studied this question.
www.synapsesocial.com/papers/69e1cfcb5cdc762e9d858be5 — DOI: https://doi.org/10.5281/zenodo.19589842