What question did this study set out to answer?

This research examines how well large language models (LLMs) adapt writing to different communicative contexts compared to human second-language learners.

July 5, 2026

Form Without Function? Register Sensitivity in CEFR‐Conditioned LLM and Second‐Language Learner Writing

Key Points

Compared LLM-generated writing to human writing across informal, semi-formal, and formal categories.
Analyzed outputs at CEFR levels A2 to C1 using Biber's multi-dimensional analysis (MDA).
Used Write & Improve prompts with 592 human and 4,724 LLM samples for evaluation.
LLM outputs showed reduced variability compared to human writing, particularly in formal-argumentative tasks.
At C1 CEFR level, LLMs demonstrated the least register flexibility, clustering near the center of the human distribution.
Limited human-like differentiation in MDA space suggests LLMs exhibit form without function.

Abstract

Large language models (LLMs) can generate text that resembles L2 learner writing at requested proficiency levels, but whether such texts adapt to communicative contexts as human learners do remains unclear. This study compared LLM‐generated and human second‐language writing across three task‐defined categories (informal–personal, semi‐formal–expository, formal–argumentative) and four CEFR levels (A2–C1) using Biber's multi‐dimensional analysis (MDA). Six instruction‐tuned models responded to Write & Improve prompts (n = 592 human, n = 4724 LLM), with EFCAMDAT as a human baseline. Under CEFR‐only prompting, LLM outputs shifted on MDA dimensions tied to informational density and explicitness, but showed weaker, differently directed, or less variable differentiation on dimensions tied to persuasion and elaboration, especially in formal–argumentative tasks. The largest observed gap was at C1, where CEFR descriptors most emphasize register flexibility. LLM outputs were more uniform than authentic learner texts and clustered near the center of the human distributional space. The broad pattern recurred across all six model families. CEFR‐conditioned generation can reproduce some structural signals of proficiency while showing limited human‐like register differentiation in MDA space, a pattern we interpret as form without function.
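The uniformity and clustering claims can be made concrete with a small sketch. The code below is not the study's method; it assumes each text already has a score on one MDA dimension and shows two simple summaries one might compute. The function names (`dispersion_ratio`, `mean_abs_z`) and the toy score lists are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def dispersion_ratio(llm_scores, human_scores):
    """Ratio of LLM to human sample standard deviation on one dimension.

    A value below 1 means the LLM texts vary less than the human texts.
    """
    return stdev(llm_scores) / stdev(human_scores)


def mean_abs_z(llm_scores, human_scores):
    """Mean absolute z-score of LLM texts relative to the human distribution.

    Small values mean the LLM texts sit close to the human center.
    """
    mu = mean(human_scores)
    sd = stdev(human_scores)
    return mean(abs((x - mu) / sd) for x in llm_scores)


# Toy, made-up dimension scores (NOT data from the study).
human = [-4.0, -1.0, 0.0, 1.0, 4.0]
llm = [-0.5, 0.0, 0.5]

print("dispersion ratio:", round(dispersion_ratio(llm, human), 3))
print("mean |z| vs. human:", round(mean_abs_z(llm, human), 3))
```

With these toy values both summaries fall well below 1, which is the shape of result the abstract describes: lower variability, and positions near the center of the human distribution. A real analysis would compute such summaries per dimension, per task category, and per CEFR level.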
