What question did this study set out to answer?

To evaluate the simulation-to-reality gap in automated driving perception system tests using a novel metric G.

February 2, 2026 · Open Access

Evaluation of the Simulation-to-Reality Gap in Novel X-in-the-Loop Test Methods for Automated Driving Perception Systems

Key Points

To evaluate the simulation-to-reality gap in automated driving perception system tests using a novel metric G.
Introduced a metric G for quantifying the simulation-to-reality gap.
Conducted test drives in real and simulated environments with an integrated sensor-in-the-loop setup.
Executed extensive tests, including static and dynamic scenarios under various weather conditions.
G remained predominantly positive, indicating a pessimistic bias in simulation-based test drives.
Over-the-air stimulation narrowed the gap compared to direct-data-injection.
Found that virtual testing is reliable under daytime conditions but requires improved weather-disturbance models for adverse weather.

Abstract

Reliable simulation-based methods are essential for the verification and validation of automated driving. To meet this need, G is introduced: a metric that quantifies the simulation-to-reality gap by comparing outcomes of environment-perception system tests executed in real and simulation-based test drives. To execute the simulation-based test drives, a novel sensor-in-the-loop test environment, combining over-the-air (OTA) and direct-data-injection (DDI) stimulation for camera and radar sensors, was integrated, enabling an extensive test campaign including static and Euro-NCAP-inspired dynamic scenarios under daytime, night, and rain conditions. For the empirical analysis of G, test drives were conducted on a real proving ground and reproduced in the laboratory within the sensor-in-the-loop environment. Test drives in both real and virtual domains generated more than two hours of data, which have been published as an open dataset for the scientific community. The results comprise a direct comparison of outcomes from real-world drives and their virtual counterparts. Across the investigated scenarios, G remained predominantly positive, confirming a pessimistic bias of simulation-based test drives. OTA consistently narrowed the gap relative to DDI, substantiating the benefits of full sensor integration. The findings indicate that tests under daytime conditions may justifiably migrate to the virtual domain, whereas accurate simulation-based validation under adverse weather still demands the development of improved weather-disturbance models; until such models exist, real-world drives remain necessary. By consolidating established perception metrics into a single, regulator-oriented indicator, the proposed metric G establishes a foundation for demonstrating simulation fidelity and aligns directly with emerging German and EU requirements for approval workflows for automated driving.
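The abstract does not state how G is computed, so the following is only a hypothetical sketch of the general idea of a signed gap indicator, not the study's definition. It assumes that each test drive yields higher-is-better perception scores in [0, 1], and that the gap is the mean of (real minus simulated) over the shared metrics, so that a positive value corresponds to a pessimistic simulation. The function name `signed_gap`, the metric names, and all numbers are invented placeholders and are not taken from the published dataset.

```python
from statistics import mean


def signed_gap(real_scores, sim_scores):
    """Hypothetical gap indicator (NOT the paper's G).

    Both arguments map metric names to higher-is-better scores in [0, 1].
    Returns the mean of (real - sim) over the metrics present in both,
    so a positive result means the simulation scored worse than reality.
    """
    shared = sorted(set(real_scores) & set(sim_scores))
    if not shared:
        raise ValueError("no metrics shared between real and simulated drives")
    return mean(real_scores[name] - sim_scores[name] for name in shared)


# Placeholder scores for one scenario (invented for illustration only).
real_drive = {"precision": 0.95, "recall": 0.90}
sim_ota = {"precision": 0.93, "recall": 0.86}  # over-the-air stimulation
sim_ddi = {"precision": 0.90, "recall": 0.80}  # direct-data-injection

gap_ota = signed_gap(real_drive, sim_ota)
gap_ddi = signed_gap(real_drive, sim_ddi)

print(f"gap (OTA): {gap_ota:.3f}")
print(f"gap (DDI): {gap_ddi:.3f}")
```

With these placeholder inputs both gaps are positive and the OTA gap is the smaller of the two, which mirrors the qualitative pattern the abstract reports. An unweighted mean is the simplest possible aggregation; the actual metric may weight or normalize its constituent perception metrics differently.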
