What question did this study set out to answer?

The aim is to assess the effectiveness of generative AI in logistics process simulation and modeling.

April 1, 2026 · Open Access

Exploring Artificial Intelligence as a Tool for Logistics Process Simulation

Key Points

Evaluated LLMs, Perplexity and ChatGPT, for discrete-event simulation in ExtendSim
Modeled a complex manufacturing system yielding 9721 tons of output
Assessed three scenarios: autonomous model creation, output estimation, and copilot-guided building
Measured estimation errors and development time reduction in the copilot approach
Output estimation achieved errors of 0.1% for Perplexity and 1.2% to 22.8% for ChatGPT after prompt refinement
The copilot approach reduced model development time from several days to 8–10 hours
Around 28% (Perplexity) and 32% (ChatGPT) of errors were identified as hallucinations
Model verification still required significant human expertise to address logical flaws

Abstract

The growing integration of generative artificial intelligence in logistics demands efficient simulation modeling. This study evaluates generative large language models, Perplexity and ChatGPT, for discrete-event simulation in ExtendSim. It focuses on modeling a real, complex manufacturing system, yielding 9721 tons of output. The following three scenarios were assessed: autonomous model creation, output estimation from process descriptions and parameters, and copilot-guided manual building. LLMs cannot autonomously construct ExtendSim models due to the lack of APIs. Output estimation only matched benchmarks after iterative prompt refinement, achieving errors of 0.1% for Perplexity and 1.2% to 22.8% for ChatGPT. Estimation without substantial human intervention proved infeasible. Only the copilot approach appeared viable despite initial errors. It enabled a validated model with 9718 tons output after resolving 25 errors for Perplexity and 22 for ChatGPT through iterative refinement. Approximately 28% (Perplexity) or 32% (ChatGPT) of the errors were hallucinations. The copilot approach reduced development time from several days to 8–10 h. Human expertise remained essential for verifying model outputs and addressing hallucinations and logical flaws. Consequently, this approach may be less feasible for inexperienced users. The copilot paradigm offers practical acceleration for experienced users; however, its limitations underscore the need for API integration and retrieval-augmented generation enhancements.
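
The percentage errors quoted above can be read as relative deviations from a reference output. The sketch below is a minimal illustration, assuming the 9721 tons figure serves as the benchmark and that error is the absolute relative difference in percent; the abstract does not state the exact formula, and the function and constant names are hypothetical.

```python
def relative_error_percent(estimate: float, benchmark: float) -> float:
    """Absolute relative error of an estimate against a benchmark, in percent.

    Assumed error definition; the abstract does not spell out its formula.
    """
    if benchmark == 0:
        raise ValueError("benchmark must be non-zero")
    return abs(estimate - benchmark) / abs(benchmark) * 100.0


# Figures taken from the abstract; treating 9721 t as the benchmark is an assumption.
BENCHMARK_TONS = 9721.0
COPILOT_MODEL_TONS = 9718.0

copilot_error = relative_error_percent(COPILOT_MODEL_TONS, BENCHMARK_TONS)
print(f"Copilot-built model deviation: {copilot_error:.3f}%")
```

Under these assumptions, the validated copilot-built model deviates from the benchmark by roughly 0.03%.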

Straka et al. studied this question.

synapsesocial.com/papers/69ccb66716edfba7beb880c8 https://doi.org/10.3390/app16073301
