This commentary discusses Lu, Song, and Zhang's (2025, Nature Human Behaviour) finding that the language of interaction systematically shifts the cultural tendencies expressed by large language models (LLMs). Across a battery of cultural-psychology tasks administered without any explicit cultural framing, both GPT and ERNIE produced more interdependent and more holistic responses when prompted in Chinese than in English, with the effect extending even to a non-verbal pictorial measure. Drawing on subsequent mechanistic work suggesting that linguistic and cultural associations may be co-represented in shared computational units, and on emerging evidence that language-based priming is unstable across models and tasks, I argue that what appears to be cultural cognition in LLMs is better characterized as a pattern of outputs shaped by training data and design choices than as a stable cultural identity. I conclude that the research agenda should expand from asking which human culture LLMs reproduce to measuring, explaining, and governing an emergent machine culture.
Yueqing Hu (Thu,) studied this question.