What question did this study set out to answer?

The study aims to evaluate large language models' (LLMs) understanding of human language and their cognitive capabilities.

May 16, 2026 · Open Access

A sentence is worth a thousand pictures: can large language models understand hum4n L4ngu4ge and the W0rld behind W0rds?

Key Points

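Analyzed the contribution of LLMs as theoretically informative representations of a cognitive system versus atheoretical mechanistic tools.
Evaluated models' performance on a leet task (l33t t4sk), which requires decoding sentences in which letters are systematically replaced by numbers.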
Hypothesized that LLMs lack grounded cognition, relying on fixed word associations.
Humans excelled in the leet task while models struggled.
Findings suggest that LLMs are limited when processing must go beyond fixed associations.
Key cognitive abilities necessary for improved model performance are still missing.

Abstract

The current generation of large language models (LLMs) has been linked to claims about human-like linguistic performance, and their applications are hailed both as a step towards artificial general intelligence and as a major advance in understanding the cognitive and even neural basis of human language. To assess these claims, first, we analysed the contribution of LLMs as theoretically informative representations of a target cognitive system versus atheoretical mechanistic tools. Second, we evaluated the models' ability to see the bigger picture through top-down feedback from higher levels of processing, which requires grounding in previous expectations and past world experience. We hypothesize that since models lack grounded cognition, they cannot take advantage of these features and instead solely rely on fixed associations between represented words and word vectors. To assess this, we ran a novel leet task (l33t t4sk), which requires decoding sentences in which letters are systematically replaced by numbers. In line with our hypothesis, the results suggest that humans excel in this task, whereas models struggle. We interpret these results by identifying the key abilities that are still missing from the current state of development of these models, which require solutions that go beyond increased system scaling. This article is part of the theme issue 'World models in natural and artificial intelligence'.
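To make the leet manipulation concrete, the following is a minimal Python sketch of letter-to-number substitution. The mapping (a→4, e→3, o→0) is an assumption inferred only from the examples visible on this page ("l33t t4sk", "hum4n", "w0rld"); the study's actual substitution scheme and stimuli may differ, and the title itself leaves some letters unreplaced. The function names `to_leet` and `from_leet_naive` are hypothetical.

```python
# Hypothetical mapping inferred from "l33t t4sk", "hum4n", "w0rld".
# This is NOT taken from the study's materials.
LEET_MAP = {"a": "4", "e": "3", "o": "0"}


def to_leet(sentence: str, mapping: dict[str, str] = LEET_MAP) -> str:
    """Replace every mapped letter (case-insensitively) with its digit."""
    return "".join(mapping.get(ch.lower(), ch) for ch in sentence)


def from_leet_naive(encoded: str, mapping: dict[str, str] = LEET_MAP) -> str:
    """Reverse the substitution with a plain character lookup.

    This also rewrites digits that were genuine numbers in the source
    sentence, and it cannot restore the case of replaced letters.
    """
    reverse = {digit: letter for letter, digit in mapping.items()}
    return "".join(reverse.get(ch, ch) for ch in encoded)


if __name__ == "__main__":
    encoded = to_leet("leet task")
    print(encoded)                   # l33t t4sk
    print(from_leet_naive(encoded))  # leet task
```

A fixed lookup like this only illustrates the encoding. It says nothing about how the study's human participants or models approached decoding.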

Cite This Study

Leivada et al. studied this question.

synapsesocial.com/papers/6a080a9fa487c87a6a40c929 https://doi.org/10.1098/rsta.2025.0008
