What question did this study set out to answer?

This research aims to develop a framework for generating realistic yet adversarial scenarios for evaluating autonomous vehicle safety.

March 14, 2026 | Open Access

Ontology-guided adversarial scenario generation for AV testing

Key Points

Developed a novel framework for adversarial scenario generation using traffic ontologies
Integrated real-time AV feasibility constraints for kinematic and dynamical admissibility
Employed Dual-Clip Proximal Policy Optimisation for stable learning with sparse rewards
Reduced collision rates to 3.5% and 4.2%, roughly 30% lower than the best baseline
Improved rule-compliance scores by up to 12%
Demonstrated enhanced interaction diversity among vehicles compared to existing methods

Abstract

Robust evaluation of autonomous vehicle (AV) safety and operational domain demands large-scale traffic scenarios that are simultaneously realistic and adversarial. In this work, we present a novel framework for generating adversarial yet feasible scenarios by explicitly controlling critical background vehicles (CBVs), whose behaviours are informed by formal traffic ontologies. The proposed pipeline integrates (i) real-time AV feasibility constraints to guarantee kinematic and dynamical admissibility, and (ii) structured semantic knowledge to ensure compliance with traffic rules and social conventions, thereby producing scenarios that are adversarial, physically plausible, and semantically valid. To stabilise learning under the sparse and noisy reward signals inherent to adversarial generation, we adopt a Dual-Clip Proximal Policy Optimisation (PPO) scheme with adaptive clipping bounds and curiosity-driven exploration bonuses. Extensive experiments on the CARLA Town05 and Town02 intersection benchmarks demonstrate that our CBV policies significantly outperform state-of-the-art baselines, including standard PPO, FPPO-RS, and FREA. Quantitatively, collision rates are reduced to 3.5% and 4.2%, an approximate 30% reduction relative to the best baseline, and rule-compliance scores improve by up to 12%. In addition, our method produces smoother and more varied interactive behaviours, indicating greater interaction diversity than the baselines. Further cross-policy generalisation tests with the Expert and PlanT AV controllers confirm consistent improvements in collision avoidance, infeasibility reduction, and overall scenario quality.
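For readers unfamiliar with the dual-clip idea mentioned in the abstract, the following is a minimal sketch of the commonly cited dual-clip PPO surrogate, not the authors' implementation. The abstract does not give the exact formulation, so the function names and the values of `eps` and `c` are assumptions chosen for illustration, and the adaptive clipping bounds and curiosity-driven bonuses described above are not modelled. The sketch shows only the core mechanism: when the advantage is negative, the objective is additionally bounded below by `c * advantage`, which limits the update size when the probability ratio becomes very large.

```python
def dual_clip_surrogate(ratio, advantage, eps=0.2, c=3.0):
    """Per-sample dual-clip PPO surrogate (a quantity to be maximised).

    ratio     -- pi_new(a|s) / pi_old(a|s)
    advantage -- estimated advantage for the sampled action
    eps       -- standard PPO clip range (assumed value, not from the paper)
    c         -- dual-clip bound, must exceed 1 (assumed value, not from the paper)
    """
    if c <= 1.0:
        raise ValueError("c must be greater than 1")

    # Standard PPO clipped surrogate.
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    standard = min(ratio * advantage, clipped_ratio * advantage)

    if advantage < 0.0:
        # Second clip: bound the objective from below at c * advantage,
        # so a very large ratio cannot produce an unbounded penalty.
        return max(standard, c * advantage)
    return standard


def mean_surrogate(ratios, advantages, eps=0.2, c=3.0):
    """Average the per-sample surrogate over a batch of (ratio, advantage) pairs."""
    values = [dual_clip_surrogate(r, a, eps, c) for r, a in zip(ratios, advantages)]
    return sum(values) / len(values)
```

With these assumed settings, a sample with ratio 10 and advantage -1 contributes -3 rather than -10, which is the stabilising effect the second clip is meant to provide under noisy rewards.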
