What does this research mean for the field?

LLM-generated personas can approximate aggregate mean patterns of human survey responses across psychological constructs, but they only modestly reproduce individual differences, even when conditioned on Big Five personality traits. Novelty: incremental. Consensus alignment: neutral.

What question did this study set out to answer?

The aim is to evaluate how well responses generated by large language models (LLMs) match human survey data based on participant characteristics.

June 3, 2026 · Open Access

Do participant-matched LLM personas approximate human survey data?

Key Points

177 human participants completed a psychological questionnaire measuring 91 constructs.
Artificial personas were generated using GPT-4o under two conditions: demographics-only and demographics plus Big Five scores.
Correspondence between human and model responses was assessed with construct means and dispersions, within-pair person-level distances, within-person profile similarity across constructs, and construct-wise human–persona correlations.
Adding Big Five scores produced small improvements in mean alignment and individual-level similarity compared with demographics only.
With Big Five scores, persona responses were far less under-dispersed relative to humans and showed modest gains in construct-wise correspondence with human data.
Overall, persona responses approximated some aggregate mean patterns across constructs but only modestly reproduced individual differences.

Abstract

Large language models can be asked to complete a survey as if they were a person. However, it is unclear how well such model-implied responses resemble human data, especially when personas are conditioned on participant information. Human participants completed a questionnaire measuring 91 psychological constructs. For each participant (n = 177) we generated artificial personas using GPT-4o under two conditions: demographics-only versus demographics plus Big Five scores. We assessed correspondence between human and model-implied responses across several metrics: construct means and dispersions, within-pair person-level distances, within-person profile similarity across constructs, and construct-wise human–persona correlations across participants. Adding Big Five information produced small improvements in mean alignment and individual-level similarity, large reductions in under-dispersion relative to humans, modest gains in construct-wise human–persona correspondence, and reduced bias, while RMSE and error variability improved mainly after averaging across multiple persona generations. Overall, model-implied responses approximate some aggregate mean patterns across constructs but only modestly reproduce individual differences.
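
The correspondence metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the study's analysis code: the function names, the dictionary keys, and the specific choices (RMSE as the within-pair distance, Pearson r for profile and construct-wise similarity, an SD ratio for dispersion) are assumptions made for illustration. It also assumes that both inputs are participant-by-construct score matrices with paired rows and that constructs are scored on comparable scales.

```python
import numpy as np


def _rowwise_pearson(a, b):
    """Pearson r between matching rows of a and b (NaN where a row is constant)."""
    a_c = a - a.mean(axis=1, keepdims=True)
    b_c = b - b.mean(axis=1, keepdims=True)
    num = (a_c * b_c).sum(axis=1)
    den = np.sqrt((a_c ** 2).sum(axis=1) * (b_c ** 2).sum(axis=1))
    with np.errstate(divide="ignore", invalid="ignore"):
        return num / den


def correspondence_metrics(human, persona):
    """Compare paired score matrices of shape (n_participants, n_constructs).

    Hypothetical helper: row i of `persona` is the persona generated for
    the participant in row i of `human`.
    """
    human = np.asarray(human, dtype=float)
    persona = np.asarray(persona, dtype=float)
    if human.shape != persona.shape:
        raise ValueError("human and persona must have the same shape")

    return {
        # Construct means: persona mean minus human mean, per construct.
        "mean_bias": persona.mean(axis=0) - human.mean(axis=0),
        # Dispersion: persona SD / human SD per construct; values below 1
        # indicate under-dispersion relative to humans.
        "sd_ratio": persona.std(axis=0, ddof=1) / human.std(axis=0, ddof=1),
        # Within-pair distance: RMSE across constructs, per participant.
        "pair_rmse": np.sqrt(((persona - human) ** 2).mean(axis=1)),
        # Profile similarity: r across constructs, per participant.
        "profile_r": _rowwise_pearson(human, persona),
        # Construct-wise correspondence: r across participants, per construct.
        "construct_r": _rowwise_pearson(human.T, persona.T),
    }
```

Under these assumptions, calling `correspondence_metrics` once per condition (demographics-only, demographics plus Big Five) and comparing the resulting arrays would mirror the kind of comparison the abstract reports. Averaging several persona generations per participant before the call would correspond to the averaging step it mentions.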

Ana Stojanov studied this question.

synapsesocial.com/papers/6a1fc4bbdee9eb8c0dce637b https://doi.org/10.1016/j.paid.2026.113915
