We identify a universal epistemic computation circuit in large language models: a 5-layer window (layers 13–19) in which epistemic states (certain, uncertain, hallucinating, refusing) are causally computed, as shown by activation patching across five models (Llama-3.1-8B, Mistral-7B-v0.2, Qwen2.5-7B, Llama-3.2-3B, Gemma-2-9B). The patching effect drops to exactly zero for all layers ≥ 20, demonstrating a clean separation between a causal computation zone and a downstream read-out point (layer 21). Within the computation zone, MLP sublayers dominate attention by 7–17× on Llama-3.1-8B, while Mistral-7B-v0.2 shows comparable MLP and attention contributions, indicating architecture-specific circuit motifs for a universal computation function.
Inna Alieksieienko studied this question.
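The layer-wise patching measurement described in the abstract can be illustrated with a minimal sketch. It is not the study's code: the six-layer scalar "model", the layer functions (`passthrough`, `compute`, `readout_bias`), the inputs, and the normalized-recovery metric are all hypothetical choices made for illustration. The sketch caches each layer's update from a clean run, writes it into a corrupted run one layer at a time, and reports how much of the clean-versus-corrupted output gap is recovered. In this toy setup only the input-dependent layers carry any causal effect, which mirrors the qualitative pattern of a computation zone followed by layers with zero patching effect.

```python
# Toy illustration of activation patching on a scalar residual model.
# All layer functions, inputs, and the metric are hypothetical assumptions.

def run(x, layers, patch=None):
    """Run the toy model; optionally overwrite one layer's update.

    patch is None or a tuple (layer_index, cached_update).
    Returns (final_residual, list_of_per_layer_updates).
    """
    resid = x
    updates = []
    for i, layer in enumerate(layers):
        update = layer(resid)
        if patch is not None and patch[0] == i:
            update = patch[1]  # overwrite with the cached clean activation
        updates.append(update)
        resid = resid + update  # residual connection
    return resid, updates


def patching_effects(layers, clean_x, corrupt_x):
    """Fraction of the clean-vs-corrupt output gap restored per patched layer."""
    clean_out, clean_updates = run(clean_x, layers)
    corrupt_out, _ = run(corrupt_x, layers)
    gap = clean_out - corrupt_out
    effects = []
    for i in range(len(layers)):
        patched_out, _ = run(corrupt_x, layers, patch=(i, clean_updates[i]))
        effects.append((patched_out - corrupt_out) / gap)
    return effects


def passthrough(resid):
    return 0.0  # early layer: contributes nothing in this toy


def compute(resid):
    return 0.5 * resid  # "computation zone": update depends on the input


def readout_bias(resid):
    return 0.1  # late layer: input-independent update, so patching is a no-op


toy_layers = [passthrough, passthrough, compute, compute,
              readout_bias, readout_bias]
effects = patching_effects(toy_layers, clean_x=1.0, corrupt_x=-1.0)

for i, effect in enumerate(effects):
    print(f"layer {i}: normalized patching effect = {effect:.3f}")
```

On a real transformer the same loop would cache and overwrite attention or MLP sublayer outputs rather than scalar updates, and the metric would typically be a logit or probability difference; those details are left out here because the abstract does not specify them.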