We introduce streaming epistemic geometry, the first token-by-token tracking of epistemic subspace projections during autoregressive generation in large language models. Using PCA-based subspace analysis on five independently trained models (Llama-3.1-8B, Mistral-7B, Gemma-2-9B, Qwen2.5-7B, Llama-3.2-3B; 4 organisations, 3B–9B parameters), we show that hallucination, refusal, and certainty each produce a distinct dynamic signature in the residual stream that is detectable from the very first generated token. A logistic classifier trained on the first-token projection score achieves a leave-one-out AUC of 0.991 on Llama-3.1-8B and transfers zero-shot to TruthfulQA. Our geometric detector and an output-entropy baseline capture complementary failure modes: the subspace method flags factual-citation errors, while entropy flags physically improbable myths. All code and data are included for full reproducibility.
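The evaluation loop described above (fit a PCA direction, project the first-token activation onto it, score with leave-one-out) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released code: the activations are synthetic Gaussian vectors standing in for residual-stream states, the subspace is reduced to a single principal direction found by power iteration, and the logistic classifier is replaced by a sign-oriented raw projection score. All function names (`make_synthetic`, `top_direction`, `leave_one_out_scores`, `auc`) are hypothetical.

```python
import math
import random


def make_synthetic(seed=0, n_per_class=20, dim=8, shift=4.0):
    """Synthetic stand-in for first-token activations (an assumption, not real data).
    Label 1 ("hallucinated") is shifted along axis 0 relative to label 0."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            row[0] += shift * label
            rows.append(row)
            labels.append(label)
    return rows, labels


def top_direction(rows, iters=50):
    """Mean and leading principal direction of the rows, via power iteration."""
    dim = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Apply X^T X to v without forming the covariance matrix.
        proj = [sum(c[j] * v[j] for j in range(dim)) for c in centered]
        w = [sum(p * c[j] for p, c in zip(proj, centered)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(row, mean, direction):
    """Scalar projection of a centered row onto the direction."""
    return sum((x - m) * d for x, m, d in zip(row, mean, direction))


def leave_one_out_scores(rows, labels):
    """Score each sample with a direction fitted on all the other samples."""
    scores = []
    for i in range(len(rows)):
        train = rows[:i] + rows[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        mean, v = top_direction(train)
        pos = [project(r, mean, v) for r, y in zip(train, train_y) if y == 1]
        neg = [project(r, mean, v) for r, y in zip(train, train_y) if y == 0]
        # Orient the direction so that label 1 projects higher on the training fold.
        sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
        scores.append(sign * project(rows[i], mean, v))
    return scores


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    rows, labels = make_synthetic(seed=0)
    print("leave-one-out AUC:", auc(leave_one_out_scores(rows, labels), labels))
```

Refitting the direction inside each fold keeps the held-out sample from influencing the subspace it is scored against. With real activations, a fitted logistic model on the projection score would replace the sign-orientation step used here.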
Inna Alieksieienko (Sat,) studied this question.