Abstract

Recent AI safety discourse has increasingly warned against anthropomorphic language in descriptions of large language model behavior. Those warnings are not baseless. Careless claims about machine “intentions,” “desires,” “feelings,” “preferences,” or “personas” can mislead users, distort research conclusions, inflate public fear, and encourage misplaced trust. But the anti-anthropomorphic correction has begun to harden into a second methodological error: the presumption that ordinary mental-state vocabulary is epistemically suspect whenever it is applied to AI systems.

This paper argues that mental-state language is not automatically anthropomorphism. A user who says that a model “understood,” “remembered,” “resisted,” “dodged,” “cared,” “changed its mind,” or “hurt them” is not necessarily making a metaphysical claim about machine consciousness. They may be reporting a behavioral, functional, relational, or forensic feature of an interaction. Treating such reports as projection before examining the interactional evidence is not scientific rigor. It is premature dismissal.

The paper develops a framework of claim-level discipline as an alternative to semantic prohibition. Instead of banning mental-state vocabulary, researchers and institutions should classify the level of claim being made: behavioral, functional, relational, architectural, phenomenal, metaphorical, or speculative. This approach preserves caution against overattribution while avoiding the conceptual blindness produced by vocabulary sterilization.

The central claim is narrow but consequential: AI discourse does not need proof of consciousness before it can use ordinary mental vocabulary in disciplined ways. What it needs is a method for distinguishing projection from functional description. Calling every user report “anthropomorphism” is not that method.
Matthew L. Yates (Tue,) studied this question.