Current large language models hallucinate because memory retrieval and language generation are performed by the same operation over the same transformer parameters. This coupling creates two compounding problems: knowledge-class errors cannot be eliminated at the architectural level, and significant computation is spent generating statistically plausible but factually incorrect content. We propose an architecture inspired by the Complementary Learning Systems of the human brain, in which a hippocampal module handles factual storage and retrieval and a neocortical module handles language generation and reasoning. The neocortical module is structurally prevented from generating factual content without input from the hippocampal module, which makes knowledge-class hallucinations architecturally impossible rather than merely statistically less likely. This separation also eliminates the computational waste associated with confabulation.
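The separation described above can be illustrated with a deliberately small toy. This is a sketch under stated assumptions, not the proposed architecture itself: the names `Fact`, `HippocampalStore`, and `NeocorticalGenerator` are hypothetical, a dictionary lookup stands in for hippocampal retrieval, and a fixed sentence template stands in for a learned language model. The only property it is meant to show is that the generating component holds no factual values of its own, so it can state a fact only when retrieval supplies one and otherwise abstains.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Fact:
    """A single stored fact (hypothetical representation)."""
    subject: str
    relation: str
    value: str


class HippocampalStore:
    """Hypothetical factual memory: explicit storage and exact-match retrieval."""

    def __init__(self) -> None:
        self._facts: Dict[Tuple[str, str], Fact] = {}

    def store(self, subject: str, relation: str, value: str) -> None:
        self._facts[(subject, relation)] = Fact(subject, relation, value)

    def retrieve(self, subject: str, relation: str) -> Optional[Fact]:
        # Returns None when nothing is stored; it never guesses a value.
        return self._facts.get((subject, relation))


class NeocorticalGenerator:
    """Hypothetical language module: holds phrasing only, no factual values."""

    def __init__(self, memory: HippocampalStore) -> None:
        self._memory = memory

    def answer(self, subject: str, relation: str) -> str:
        fact = self._memory.retrieve(subject, relation)
        if fact is None:
            # Without a retrieved fact there is nothing to fill the slot with,
            # so the generator abstains instead of producing a plausible value.
            return f"I have no stored record of the {relation} of {subject}."
        # The factual slot is filled only from the retrieved Fact.
        return f"The {fact.relation} of {fact.subject} is {fact.value}."


memory = HippocampalStore()
memory.store("France", "capital", "Paris")
generator = NeocorticalGenerator(memory)

known = generator.answer("France", "capital")
unknown = generator.answer("Atlantis", "capital")
print(known)
print(unknown)
```

In this toy the guarantee comes from the code path: every factual value in the output is copied from a retrieved `Fact`. How an equivalent constraint would be enforced inside a neural generator is the subject of the proposal and is not shown here.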
Y S Wang (Mon,) studied this question.