This work introduces a unified causal framework for diagnosing hallucinations in large language models (LLMs) by treating information, measurement, and perception as a continuous physical process. The framework defines information as a physically realizable distinction between system states and models the full causal chain from state to action:

STATE → DISTINCTION → COUPLING → MEASUREMENT → FILTERING → REPRESENTATION → ACTION

Unlike existing evaluation methods that assess output correctness, this work focuses on identifying where information is lost or distorted before the output is produced.

The framework introduces a set of operational metrics:
- Information Loss (L_total): quantifies how much relevant information fails to propagate through the system
- Distortion (D): measures deviation between generated claims and ground truth
- Unsupported Claim Rate (UCR): fraction of claims not supported by evidence
- Critical Distinction Loss (CDL): tracks loss of causally relevant distinctions
- Decision Cost (C_total): combines computational cost with downstream error impact

An extended version includes:
- Risk-weighted CDL (RW-CDL) for real-world decision systems
- Mutual Information estimation using neural estimators (MINE / InfoNCE)
- Knowledge Graph-based extraction of critical distinctions
- Markov Decision Process (MDP) modeling for long-term system degradation
- A physically grounded interpretation of information processing costs inspired by Landauer’s principle

The framework is validated through applied simulations in LLM-based question answering and retrieval-augmented generation (RAG), demonstrating that hallucinations are not random failures but the result of information loss and distortion across the pipeline. This work positions hallucination not as a linguistic issue, but as a systemic failure in information flow. The primary contribution is a causal diagnostic framework that enables identification, quantification, and localization of errors in AI systems.
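As a rough illustration of two ideas named above, the sketch below computes the Unsupported Claim Rate as the stated fraction of claims lacking evidence, and locates the pipeline stage with the largest drop in retained critical distinctions. The `Claim` type, its `supported` label, the `retained` counts, and the `largest_drop` helper are all hypothetical names introduced here for illustration; the abstract does not specify how claims are labeled or how per-stage loss is measured, so this should be read as one possible reading rather than the author's method.

```python
from dataclasses import dataclass

# Stage names taken from the causal chain in the abstract.
STAGES = [
    "STATE", "DISTINCTION", "COUPLING", "MEASUREMENT",
    "FILTERING", "REPRESENTATION", "ACTION",
]


@dataclass
class Claim:
    text: str
    supported: bool  # hypothetical label: True if evidence backs the claim


def unsupported_claim_rate(claims):
    """Fraction of claims not supported by evidence (0.0 for no claims)."""
    if not claims:
        return 0.0
    unsupported = sum(1 for c in claims if not c.supported)
    return unsupported / len(claims)


def largest_drop(retained):
    """Find the stage at which the most critical distinctions disappear.

    `retained` maps each stage name to the number of critical distinctions
    still present at that stage (an assumed measurement, not defined in
    the abstract). Returns (stage, drop), or (None, 0) if nothing is lost.
    """
    worst_stage, worst_drop = None, 0
    for prev, cur in zip(STAGES, STAGES[1:]):
        drop = retained[prev] - retained[cur]
        if drop > worst_drop:
            worst_stage, worst_drop = cur, drop
    return worst_stage, worst_drop


claims = [
    Claim("a", True), Claim("b", False),
    Claim("c", False), Claim("d", True),
]
ucr = unsupported_claim_rate(claims)

retained = dict(zip(STAGES, [10, 10, 9, 9, 4, 4, 4]))
stage, drop = largest_drop(retained)
```

With these made-up inputs, half of the claims are unsupported and the largest loss occurs on entering the FILTERING stage, which is the kind of localization the framework aims for.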
Keywords: information theory, LLM hallucination, causal inference, entropy, complex systems, AI safety, decision systems, information loss, operational intelligence

Commercial use of this framework, in whole or in part, including integration into commercial systems, products, or services, is strictly prohibited without explicit written permission from the author.
Heorhii Hohilauri (Sun,) studied this question.