Recent advances in large language models (LLMs) have demonstrated strong planning and reasoning capabilities, making them candidates for autonomous decision-making agents. This paper presents a framework for integrating such LLM-based AI agents into the agent-based traffic simulator MATSim. The framework comprises LLM servers (local or remote), a multi-step tool-calling engine, and a vector database for retrieval-augmented generation (RAG). A verification module evaluates each outcome and returns feedback for unsuccessful scenarios, enabling iterative replanning. The framework was applied to electric vehicle (EV) charging plans and evaluated with six state-of-the-art LLMs, using a randomly selected JSON plan and carefully crafted prompts. Only GPT-4o and GPT-4-turbo achieved a replanning success rate above 50%. The framework was then run on a 1% MATSim Montreal urban EV scenario with GPT-4o alongside a random charging scheduler, and the simulation converged to equilibrium within the provided iterations. AI-generated plans showed higher retention rates (17.3% vs. 12.6%) and selection rates (4% vs. 2%) than plans from the random scheduler, suggesting higher plan quality.
Patwary et al. studied this question.
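The verify-and-replan cycle described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the plan layout, `verify_plan`, `stub_planner`, and `replan_with_feedback` are all hypothetical names introduced here, and the stub planner stands in for a real LLM call. The sketch only shows how verifier feedback could be passed back into the next planning attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Plan = dict  # hypothetical stand-in for a MATSim plan serialized as JSON


@dataclass
class Verdict:
    ok: bool
    feedback: str = ""


def verify_plan(plan: Plan) -> Verdict:
    """Toy verifier (assumption): accept only plans with a charging activity."""
    types = [act["type"] for act in plan["activities"]]
    if "charging" not in types:
        return Verdict(False, "Plan has no charging activity; insert one.")
    return Verdict(True)


def stub_planner(plan: Plan, feedback: str) -> Plan:
    """Stands in for the LLM call; a real agent would prompt a model here."""
    activities = list(plan["activities"])  # copy so the input plan is untouched
    if feedback:  # the stub only reacts once the verifier has complained
        activities.insert(1, {"type": "charging", "duration_min": 30})
    return {"activities": activities}


def replan_with_feedback(
    plan: Plan,
    propose: Callable[[Plan, str], Plan],
    verify: Callable[[Plan], Verdict],
    max_attempts: int = 3,
) -> Tuple[Optional[Plan], int]:
    """Propose plans until the verifier accepts one or attempts run out."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        candidate = propose(plan, feedback)
        verdict = verify(candidate)
        if verdict.ok:
            return candidate, attempt
        feedback = verdict.feedback  # fed into the next proposal
    return None, max_attempts


original = {"activities": [{"type": "home"}, {"type": "work"}]}
new_plan, attempts = replan_with_feedback(original, stub_planner, verify_plan)
```

With the stub planner, the first proposal is rejected and the second, which receives the verifier's message, is accepted. A replanning success rate such as the one reported above would then correspond to the fraction of runs in which the loop returns a plan rather than `None`.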