Communicating the design and results of agent-based models (ABMs) to subject matter experts is challenging, which hinders participation and limits trust in simulation-based decision support. Large language models (LLMs) can communicate ABMs as textual summaries, thus complementing traditional disclosure through statistical and visualization techniques. While prior work translated the structure of conceptual models into narratives via LLMs, our extension covers the dynamics of simulation models via an automated simulation-to-text method that extracts contextual information from NetLogo ABMs, performs repeated simulations, and generates narrative descriptions (including the model’s purpose, parameters, and simulation dynamics) using multimodal LLMs. In addition, four summarization algorithms spanning abstractive and extractive methods provide shorter reports. Using Design-of-Experiments methods over three peer-reviewed ABMs, state-of-the-art multimodal LLMs from 2026 (Gemini 3.1 Pro, Qwen 3.5, Kimi K2.5, Claude Opus 4.6), and different prompt elements (e.g., roles, examples, generating insights, statistical analyses), we compare the generated reports with several reference reports (e.g., from associate professors). We find that report quality is determined mainly by the summarization algorithm and its interaction with the LLM (up to 34% of the variance): abstractive summarizers (BART, T5) produce more coherent and readable reports, and Claude Opus 4.6 is the most robust LLM.
Flandre et al. (Wed,) studied this question.