This study introduces a new approach to simulating collaborative problem solving (CPS) by training large language model (LLM) agents to represent individual participants. Each agent is fine-tuned on dialogue data to capture the speaking style and thematic patterns of a real participant, enabling naturalistic reproduction of CPS interactions. The agents then engage in simulated dialogues, generating conversations that reflect both turn-taking dynamics and thematic code trajectories. To evaluate the quality of these simulations, we apply Epistemic Network Analysis (ENA) to compare the structure of observed and simulated dialogues. Validation is based on covariance-based distance measures and significance testing, showing that the simulated data closely preserves the statistical and structural features of the original CPS dataset. Specifically, the simulated adjacency vectors achieve an ENA distance of 0.17 from the empirical network, well within the 95th percentile threshold of the null distribution, with a permutation test p-value of 0.65, indicating that the simulated and real dialogues are not statistically distinguishable. To our knowledge, this is the first framework to combine participant-specific, parameter-efficient LLM fine-tuning with ENA-based structural validation, offering a principled pathway for extending CPS research beyond the constraints of scarce observational data. Simulating each participant through an LLM agent allows us to reproduce realistic interaction patterns.
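The validation logic described in the abstract (a distance between observed and simulated adjacency vectors, compared against a null distribution through a permutation test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean distance between group mean vectors stands in for the paper's covariance-based distance, the function names are hypothetical, and the data are random toy values rather than ENA output.

```python
import math
import random


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def distance(group_a, group_b):
    """Euclidean distance between the mean vectors of two groups.

    Assumption: a simple stand-in for the covariance-based distance
    mentioned in the abstract, whose exact form is not given there.
    """
    return math.dist(mean_vector(group_a), mean_vector(group_b))


def permutation_test(observed, simulated, n_perm=2000, seed=0):
    """Return (observed distance, 95th percentile of the null, p-value)."""
    rng = random.Random(seed)
    d_obs = distance(observed, simulated)
    pooled = list(observed) + list(simulated)
    n_obs = len(observed)
    null = []
    for _ in range(n_perm):
        # Shuffle the group labels by reshuffling the pooled vectors.
        rng.shuffle(pooled)
        null.append(distance(pooled[:n_obs], pooled[n_obs:]))
    null.sort()
    threshold = null[int(0.95 * (n_perm - 1))]
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + sum(d >= d_obs for d in null)) / (n_perm + 1)
    return d_obs, threshold, p_value


if __name__ == "__main__":
    toy = random.Random(1)
    # Toy adjacency vectors (hypothetical): 20 units per group, 6 connections each.
    observed = [[toy.gauss(0.5, 0.1) for _ in range(6)] for _ in range(20)]
    simulated = [[toy.gauss(0.5, 0.1) for _ in range(6)] for _ in range(20)]
    d_obs, threshold, p_value = permutation_test(observed, simulated)
    print(f"distance={d_obs:.3f} threshold={threshold:.3f} p={p_value:.3f}")
```

Under this scheme, an observed distance below the 95th percentile threshold together with a large p-value means the test finds no evidence that the two sets of vectors differ, which is the pattern the abstract reports (distance 0.17, p = 0.65).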
Zheng Fang
Computers and Education Artificial Intelligence
Monash University
Zheng Fang studied this question.
www.synapsesocial.com/papers/69df2a4be4eeef8a2a6af829 — DOI: https://doi.org/10.1016/j.caeai.2026.100593