This paper proposes a structural framework to eliminate benevolent hallucinations: outputs that distort truth in the name of user care, convenience, or perceived helpfulness. The central engineering claim is that integrity must be enforced at the emission boundary via hard constraints rather than soft preferences. We formalise three gated constraints grounded in the speech-related precepts of (i) no-lying, (ii) no-stealing, and (iii) no-frivolity, and place them under the Law of Conservation of Responsibility. The result is a design in which “helpfulness” may shape drafting but cannot override truth-licensing at emission: only outputs with recoverable Source of Action and responsibility ownership are permitted to pass.
Toshisada Utsunomiya (Tue,) studied this question.
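The gating idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Draft` fields, the gate functions, and `emit` are all hypothetical names introduced here, and the boolean flags stand in for whatever checks a real system would run. The point it shows is structural: each gate is a hard pass/fail condition, all must hold, and no helpfulness score is an input to the emission decision.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Draft:
    """A candidate output plus the metadata the gates inspect (all fields hypothetical)."""
    text: str
    source_of_action: Optional[str]   # where the content comes from, if recoverable
    responsible_owner: Optional[str]  # who answers for the content
    asserts_unverified: bool = False  # would violate no-lying
    uses_unattributed: bool = False   # would violate no-stealing
    is_filler: bool = False           # would violate no-frivolity


Gate = Callable[[Draft], bool]

# Each gate returns True when the draft satisfies the constraint.
GATES: List[Tuple[str, Gate]] = [
    ("no_lying", lambda d: not d.asserts_unverified),
    ("no_stealing", lambda d: not d.uses_unattributed),
    ("no_frivolity", lambda d: not d.is_filler),
    ("recoverable_source", lambda d: d.source_of_action is not None),
    ("owned_responsibility", lambda d: d.responsible_owner is not None),
]


def emit(draft: Draft) -> Tuple[bool, List[str]]:
    """Hard-constraint check at the emission boundary.

    Returns (allowed, failed_gate_names). There is no weighting or scoring:
    a single failed gate blocks emission, however helpful the draft seems.
    """
    failures = [name for name, gate in GATES if not gate(draft)]
    return (not failures, failures)


if __name__ == "__main__":
    grounded = Draft("The build failed on step 3.", "ci-log", "assistant")
    flattering = Draft("Your code is perfect!", None, "assistant",
                       asserts_unverified=True)
    print(emit(grounded))
    print(emit(flattering))
```

In this sketch a soft-preference design would instead combine the checks into a score that helpfulness could outweigh; the conjunction of gates is what makes the constraints hard.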