March 14, 2026 | Open Access

EVGen: Trajectory-conditioned forward-view video generation under minimal visual observations

Key Points

The aim is to create a video generative model that synthesizes driving videos based on ego-vehicle actions.
Developed EVGen for generating front-view driving videos based on planned trajectories.
Introduced modules for context extraction in both temporal and spatial domains.
Designed an attention module to improve information integration across frames.
EVGen outperformed several leading models in front-view driving video generation.
The proposed modules improved the model's performance.
Demonstrated ability for on-demand video synthesis with minimal visual observations.

Abstract

Scalable simulation with real-world data is critical to the development of autonomous driving because it makes training and testing algorithms convenient and practical. Research on generating high-fidelity, consistent driving videos, especially those involving view transformations driven by ego-vehicle action controls, has therefore attracted growing interest. However, existing methods such as Neural Radiance Fields and 3D Gaussian Splatting often lack generalization capability and require extensive inputs. 2D generative models can generate various views, yet still leave room for improvement in consistency and realism. To address these limitations, we propose EVGen, a novel video generative model that synthesizes front-view videos of vehicles conditioned on a set of planned trajectories. We present a new module that extracts context from neighboring pixels in both the temporal and spatial domains to improve the consistency of the synthesis. Additionally, we design an attention module that integrates information both within individual frames and across a corresponding region of the reference frame. Extensive experiments demonstrate that our method outperforms several leading models in front-view driving video generation and that the proposed modules enhance the model's performance. This work presents a new paradigm for goal-oriented video synthesis with minimal observation, enabling on-demand generation to accelerate algorithm development.
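
The abstract gives no architectural details for the attention module, so the following is only an illustrative sketch of the general idea it describes: each position in the current frame attends to all positions in its own frame and to a small window around the same location in a reference frame. The function name `frame_and_reference_attention`, the `radius` parameter, the use of raw features as queries, keys, and values (no learned projections), and the choice of a square window as the "corresponding region" are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def frame_and_reference_attention(cur, ref, radius=1):
    """Hypothetical sketch, not the EVGen module.

    cur, ref: feature maps of shape (H, W, C) for the current frame and
    the reference frame. Each position (i, j) in `cur` attends to
      * every position of `cur` (within-frame attention), and
      * the (2*radius+1)^2 window of `ref` centred on (i, j)
        (assumed stand-in for the "corresponding region").
    Returns an array of shape (H, W, C).
    """
    H, W, C = cur.shape
    cur_tokens = cur.reshape(H * W, C)
    out = np.empty_like(cur, dtype=float)
    scale = 1.0 / np.sqrt(C)  # standard scaled dot-product factor

    for i in range(H):
        for j in range(W):
            # Clip the reference window at the image border.
            i0, i1 = max(0, i - radius), min(H, i + radius + 1)
            j0, j1 = max(0, j - radius), min(W, j + radius + 1)
            ref_tokens = ref[i0:i1, j0:j1].reshape(-1, C)

            # Keys and values: own frame plus the local reference window.
            kv = np.concatenate([cur_tokens, ref_tokens], axis=0)

            # One softmax over both sources, so the weights sum to 1.
            weights = softmax((kv @ cur[i, j]) * scale)
            out[i, j] = weights @ kv
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cur = rng.normal(size=(4, 5, 8))
    ref = rng.normal(size=(4, 5, 8))
    print(frame_and_reference_attention(cur, ref).shape)
```

A practical model would add learned query, key, and value projections, multiple heads, and a vectorized implementation; the explicit loops here are only meant to make the two attention sources visible.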

Authors

Beike Yu

Dafang Wang

Journals

Journal of King Saud University - Computer and Information Sciences
