Abstract

A diffuse but powerful confusion has settled over the contemporary discourse on artificial intelligence: the assumption that a sufficiently large language model is, or is becoming, a mind. This paper argues that the assumption is a category error. A large language model is a stateless renderer of text: a powerful one, but a renderer nonetheless. It has no biography, no persistent self-model, no internal modulation, and no mechanism by which experience consolidates into structure. These are not deficiencies to be remedied by scale; they are properties of an entirely different class of system. Cognition, to the extent the term means anything when applied to machines, does not reside in the weights of the model. It resides, if anywhere, in the architecture that surrounds the model: in persistent sub-symbolic substrates, in memory consolidation mechanisms, in modulated affect, in self-models that endure across sessions, and in the safe pathways by which a system modifies itself. The paper develops this agent-layer thesis, distinguishes it from related proposals (tool use, retrieval-augmented generation, scaffolded prompting, and existing cognitive architectures for language agents), and presents, as one worked example of the thesis in practice, a working architecture in which the language model is treated explicitly as voice rather than mind.
Winston Duncan (Tue,) studied this question.