High-fidelity simulation testing is critical to ensuring the safety and reliability of autonomous driving systems. However, traditional methods for constructing simulation scenarios face two major bottlenecks. First, acquiring realistic road network topologies that adhere to physical and traffic rules is expensive. Second, manually placing scenario elements (e.g., vehicles and pedestrians) is time-consuming and labor-intensive, and does not scale to the demands of large-scale, diverse testing. To address these challenges, this paper proposes an efficient, automated simulation scenario generation method and toolchain. The approach first extracts road network topologies from real-world data sources (e.g., open map datasets) and then uses specialized tools, such as RoadRunner, to automatically assign traffic semantics and rules. The key innovation is to use the image-text understanding capabilities of large multimodal models (LMMs) to analyze road network images and textual descriptions, generating a semantic heatmap that represents the spatial distribution probabilities of scenario elements. This heatmap guides the procedural content generation (PCG) process, enabling intelligent and scalable placement of traffic participants. Experimental results demonstrate that the proposed method can efficiently generate large-scale, high-fidelity simulation scenarios at low cost. The generated scenarios not only remain realistic in topology and traffic rules but also support rich perception and interaction. Furthermore, based on this method, we have constructed and released a new simulation dataset tailored for training perception algorithms, which further demonstrates the practical value of the toolchain.
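As an illustration of how a semantic heatmap could guide placement, the following minimal sketch treats the heatmap as a grid of non-negative weights and draws distinct cells with probability proportional to weight. The abstract does not specify the sampling procedure, so the grid representation, the function name `sample_placements`, and the without-replacement strategy are all assumptions made for this example, not the authors' implementation.

```python
import random


def sample_placements(heatmap, n, seed=None):
    """Draw n distinct grid cells, each chosen with probability
    proportional to its heatmap weight (hypothetical helper)."""
    rng = random.Random(seed)

    # Keep only cells with positive weight; zero-weight cells
    # (e.g. off-road areas) can never be selected.
    cells = [
        (r, c)
        for r, row in enumerate(heatmap)
        for c, w in enumerate(row)
        if w > 0
    ]
    weights = [heatmap[r][c] for r, c in cells]

    if n > len(cells):
        raise ValueError("more placements requested than non-zero cells")

    chosen = []
    for _ in range(n):
        # Weighted draw of one index, then remove it so that no
        # cell receives two traffic participants.
        idx = rng.choices(range(len(cells)), weights=weights, k=1)[0]
        chosen.append(cells.pop(idx))
        weights.pop(idx)
    return chosen
```

For example, `sample_placements([[0.0, 0.2], [0.5, 0.3]], 2, seed=0)` returns two distinct `(row, column)` cells and never picks the zero-weight cell at `(0, 0)`. In a real pipeline the grid indices would still need to be mapped to road-network coordinates and checked against lane geometry.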
Li et al. (Thu,) studied this question.