Modern large language models (LLMs) exhibit strong generative capabilities but remain prone to producing fluent yet factually incorrect outputs. A key limitation of existing approaches is the absence of an explicit representation of internal reasoning dynamics, with generation and evaluation typically occurring within a single probabilistic process. We introduce the Sakshi-Protocol, a control-layer architecture that separates generation, observation, and decision-making through an explicit cognitive state-space representation. This state-space captures interpretable properties of internal behavior, including stability, reactivity, transformation, valuation, and integration. We define a distortion metric over this representation to estimate epistemic instability and guide intervention decisions during inference. We demonstrate empirically that internal signals are fundamentally insufficient to detect high-confidence hallucinations, establishing a boundary condition for this class of approaches. The framework responds by integrating a distortion-guided external grounding mechanism, selectively invoked when epistemic risk is elevated. This enables the system to regulate when verification is required rather than attempting to directly classify correctness. Evaluation demonstrates that distortion-guided control produces consistent separation between control regions, while selective grounding reduces hallucination rate and preserves baseline accuracy. These results motivate state-space modeling and distortion-guided intervention as a principled approach to improving reliability in LLM systems.
Vidyesh N. K. (Mon,) studied this question.
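To make the control idea in the abstract concrete, the following is a minimal sketch of a distortion-guided gate. The abstract names the five state dimensions but does not give the metric, the reference state, or the threshold, so everything here is an assumption for illustration: the `CognitiveState` fields are taken to be scores in [0, 1], `REFERENCE` is a hypothetical "undistorted" state, `distortion` is a root-mean-square distance chosen as a stand-in, and `threshold=0.5` is arbitrary. It is not the Sakshi-Protocol's actual definition.

```python
import math
from dataclasses import dataclass, fields


@dataclass
class CognitiveState:
    # The five properties named in the abstract; scaling to [0, 1] is assumed.
    stability: float
    reactivity: float
    transformation: float
    valuation: float
    integration: float


# Hypothetical reference state (assumption, not from the source).
REFERENCE = CognitiveState(1.0, 0.0, 0.0, 1.0, 1.0)


def distortion(state: CognitiveState, reference: CognitiveState = REFERENCE) -> float:
    """Stand-in distortion metric: RMS distance from the reference state."""
    diffs = [
        getattr(state, f.name) - getattr(reference, f.name)
        for f in fields(CognitiveState)
    ]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def decide(state: CognitiveState, threshold: float = 0.5) -> str:
    """Return 'ground' when distortion is elevated, otherwise 'accept'.

    The gate decides when to request external verification; it does not
    try to classify the output as correct or incorrect.
    """
    return "ground" if distortion(state) >= threshold else "accept"


if __name__ == "__main__":
    calm = CognitiveState(1.0, 0.0, 0.0, 1.0, 1.0)
    unstable = CognitiveState(0.0, 1.0, 1.0, 0.0, 0.0)
    print(decide(calm), decide(unstable))
```

Under these assumptions the gate only regulates when grounding is invoked, which matches the abstract's framing that internal signals alone cannot detect high-confidence hallucinations.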