What question did this study set out to answer?

The aim is to create LLM agents that accurately simulate individual participants in collaborative problem solving scenarios.

April 15, 2026 · Open Access

Modeling individual participants as LLM agents in collaborative problem solving simulations

Key Points

The study aims to create LLM agents that accurately simulate individual participants in collaborative problem solving (CPS) scenarios.
Each agent is fine-tuned on dialogue data to mimic a participant's speaking style.
The agents generate simulated dialogues that reflect turn-taking and thematic patterns.
Epistemic Network Analysis (ENA) is used to compare the simulated dialogues against real ones.
Validation relies on covariance-based distance measures and significance testing.
The simulated data preserve the structural features of the original CPS dataset, with an ENA distance of 0.17.
A permutation test finds no significant difference between simulated and observed dialogues (p = 0.65).
The framework offers a new approach to CPS research based on participant-specific LLM fine-tuning.

Abstract

This study introduces a new approach to simulating collaborative problem solving (CPS) by training large language model (LLM) agents to represent individual participants. Each agent is fine-tuned on dialogue data to capture the speaking style and thematic patterns of a real participant, enabling naturalistic reproduction of CPS interactions. The agents then engage in simulated dialogues, generating conversations that reflect both turn-taking dynamics and thematic code trajectories. To evaluate the quality of these simulations, we apply Epistemic Network Analysis (ENA) to compare the structure of observed and simulated dialogues. Validation is based on covariance-based distance measures and significance testing, showing that the simulated data closely preserve the statistical and structural features of the original CPS dataset. Specifically, the simulated adjacency vectors achieve an ENA distance of 0.17 from the empirical network, well within the 95th percentile threshold of the null distribution, with a permutation test p-value of 0.65, indicating that simulated and real dialogues are statistically indistinguishable. To our knowledge, this is the first framework to combine participant-specific parameter-efficient LLM fine-tuning with ENA-based structural validation, offering a principled pathway for extending CPS research beyond the constraints of scarce observational data. Simulating each participant through an LLM agent allows us to reproduce realistic interaction patterns.
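
The validation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact covariance-based distance, so the sketch assumes a Mahalanobis-style distance between the mean adjacency vectors of the two groups, scaled by their pooled covariance. The function names (`covariance_scaled_distance`, `permutation_test`), the ridge term, and the number of permutations are all hypothetical choices made for illustration.

```python
import numpy as np


def covariance_scaled_distance(a, b, ridge=1e-6):
    """Distance between two group means, scaled by the pooled covariance.

    Assumption: a Mahalanobis-style measure stands in for the paper's
    covariance-based distance, which the abstract does not define.
    Rows are adjacency vectors (one per dialogue unit), columns are
    code co-occurrence strengths (at least two columns).
    """
    diff = a.mean(axis=0) - b.mean(axis=0)
    centered = np.vstack([a - a.mean(axis=0), b - b.mean(axis=0)])
    pooled = np.cov(centered, rowvar=False)
    # A small ridge keeps the solve stable when the covariance is near-singular.
    pooled = pooled + ridge * np.eye(pooled.shape[0])
    return float(np.sqrt(diff @ np.linalg.solve(pooled, diff)))


def permutation_test(observed, simulated, n_perm=2000, seed=0):
    """Permutation test on the observed-vs-simulated distance.

    Returns the distance, the 95th percentile of the null distribution,
    and the p-value (share of shuffled distances at least as large).
    """
    rng = np.random.default_rng(seed)
    stat = covariance_scaled_distance(observed, simulated)
    stacked = np.vstack([observed, simulated])
    n_obs = len(observed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffle group labels: under the null, the labels are exchangeable.
        idx = rng.permutation(len(stacked))
        null[i] = covariance_scaled_distance(stacked[idx[:n_obs]], stacked[idx[n_obs:]])
    p_value = (1 + np.sum(null >= stat)) / (1 + n_perm)
    threshold = float(np.quantile(null, 0.95))
    return stat, threshold, float(p_value)


if __name__ == "__main__":
    # Synthetic stand-in data, NOT the study's dataset.
    rng = np.random.default_rng(42)
    observed = rng.normal(size=(30, 4))
    simulated = rng.normal(size=(30, 4))
    stat, threshold, p_value = permutation_test(observed, simulated, n_perm=500)
    print(f"distance={stat:.3f} threshold={threshold:.3f} p={p_value:.3f}")
```

Under this reading, a distance below the 95th percentile threshold together with a large p-value means the test gives no evidence that the simulated adjacency vectors differ from the observed ones, which is the pattern the abstract reports (distance 0.17, p = 0.65).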

Authors

Zheng Fang

Journals

Computers and Education: Artificial Intelligence

Institutions

Monash University
