Large language models can be asked to complete a survey as if they were a person. However, it is unclear how well such model-implied responses resemble human data, especially when personas are conditioned on participant information. Human participants completed a questionnaire measuring 91 psychological constructs. For each participant (n = 177), we generated artificial personas using GPT-4o under two conditions: demographics only versus demographics plus Big Five scores. We assessed correspondence between human and model-implied responses using several metrics: construct means and dispersions, within-pair person-level distances, within-person profile similarity across constructs, and construct-wise human–persona correlations across participants. Compared with demographics alone, adding Big Five information produced small improvements in mean alignment and individual-level similarity, large reductions in under-dispersion relative to humans, modest gains in construct-wise human–persona correspondence, and reduced bias. RMSE and error variability improved mainly after averaging across multiple persona generations. Overall, model-implied responses approximate some aggregate mean patterns across constructs but only modestly reproduce individual differences.
Ana Stojanov studied this question.
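To make the correspondence metrics named in the abstract concrete, the following is a minimal Python sketch of how they might be computed for one persona condition. It is not the authors' analysis code. The function name `correspondence_metrics`, the dictionary keys, and the specific operationalizations (Euclidean distance for within-pair distance, Pearson correlation for profile similarity, a persona-to-human SD ratio for dispersion) are assumptions made for illustration. The abstract does not specify these choices.

```python
import numpy as np


def _rowwise_pearson(a, b):
    """Pearson correlation between matching rows of a and b.

    Rows with zero variance yield NaN (division by zero).
    """
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    return (a * b).sum(axis=1) / denom


def correspondence_metrics(human, persona):
    """Compare paired human and persona scores (hypothetical helper).

    Both inputs have shape (n_participants, n_constructs); row i of
    `persona` is the persona generated for the participant in row i
    of `human`.
    """
    human = np.asarray(human, dtype=float)
    persona = np.asarray(persona, dtype=float)
    diff = persona - human

    return {
        # Construct level: difference in means, one value per construct.
        "mean_gap": persona.mean(axis=0) - human.mean(axis=0),
        # Construct level: SD ratio; values below 1 indicate that the
        # personas are under-dispersed relative to humans.
        "sd_ratio": persona.std(axis=0, ddof=1) / human.std(axis=0, ddof=1),
        # Person level: Euclidean distance within each human-persona pair.
        "pair_distance": np.sqrt((diff ** 2).sum(axis=1)),
        # Person level: profile similarity across constructs.
        "profile_r": _rowwise_pearson(human, persona),
        # Construct level: human-persona correlation across participants.
        "construct_r": _rowwise_pearson(human.T, persona.T),
        # Overall signed error (bias) and root mean squared error.
        "bias": diff.mean(),
        "rmse": np.sqrt((diff ** 2).mean()),
    }


if __name__ == "__main__":
    # Synthetic data for demonstration only; not the study's data.
    rng = np.random.default_rng(0)
    human = rng.normal(size=(177, 91))
    persona = 0.5 * human + rng.normal(scale=0.5, size=human.shape)
    metrics = correspondence_metrics(human, persona)
    print("median SD ratio:", np.median(metrics["sd_ratio"]))
    print("median construct-wise r:", np.median(metrics["construct_r"]))
```

Under these assumptions, the function would be run once per condition (demographics only, demographics plus Big Five) and the resulting values compared. If several personas are generated per participant, their scores could be averaged before the call, which is one way to examine the effect of averaging across generations on RMSE.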