What type of study is this?

This is an experimental study.

October 31, 2025 · Open Access

Empowering Aerial Maneuver Games Through Model-Based Constrained Reinforcement Learning

Key points

The agent shows superior zero-shot performance and higher sample efficiency than model-free baselines, and it fine-tunes rapidly against novel opponents.
The learning framework combines a population-based self-play pipeline with curriculum initialization, enabling strategy discovery without expert priors.
A Lipschitz regularizer constrains value error during policy optimization, which is intended to stabilize training.
In a high-fidelity 6-Degree-of-Freedom simulation, the agent handles stochastic dynamics while balancing aggression and survival in air combat.
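
The summary does not say how the Lipschitz regularizer is formulated. As a rough illustration only, one generic way to discourage a value function from changing faster than a chosen bound is a pairwise slope penalty. The function name `lipschitz_penalty`, the bound `k`, and the squared-hinge form below are assumptions made for this sketch, not the authors' method.

```python
import math

def lipschitz_penalty(value_fn, states, k=1.0):
    """Mean squared hinge on pairwise slopes of value_fn that exceed k.

    Hypothetical illustration: the paper's actual regularizer may differ.
    """
    total, pairs = 0.0, 0
    for i in range(len(states)):
        for j in range(i + 1, len(states)):
            dist = math.dist(states[i], states[j])
            if dist == 0.0:
                continue  # identical states carry no slope information
            slope = abs(value_fn(states[i]) - value_fn(states[j])) / dist
            total += max(0.0, slope - k) ** 2  # penalize only the excess over k
            pairs += 1
    return total / pairs if pairs else 0.0

# Toy value function with slope 3 along its single state dimension.
toy_value = lambda s: 3.0 * s[0]
toy_states = [[0.0], [1.0], [2.0]]
penalty_tight = lipschitz_penalty(toy_value, toy_states, k=1.0)
penalty_loose = lipschitz_penalty(toy_value, toy_states, k=3.0)
```

In a training loop, a term like this would typically be added to the critic loss with a small weight; the penalty is zero whenever every sampled pair already respects the bound.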

Abstract

Achieving full autonomy in Within-Visual-Range air combat with a single, end-to-end learning policy is a formidable challenge, where agents must navigate stochastic dynamics and sparse rewards to master the delicate trade-off between aggression and survival. To tackle this, we introduce a Model-Based Reinforcement Learning agent that combines the Dreamer framework with safety-aware objectives. To enhance learning stability and foresight in this demanding domain, we augment Dreamer's world model (WM) with an Information Noise-Contrastive Estimation loss for long-range dependencies, categorical predictors to robustly model outcomes, Dyna-style actor-critic updates to ground the policy, and a Lipschitz regularizer to constrain value error. Furthermore, our framework integrates a population-based self-play pipeline with curriculum initialization, enabling rapid strategic discovery without expert priors. To validate our approach, we conducted evaluations in a high-fidelity 6-Degree-of-Freedom simulation, where our agent demonstrated superior zero-shot performance, significantly higher sample efficiency than model-free baselines, and rapid fine-tuning against novel opponents, highlighting a viable path toward deployable autonomous agents.
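
For readers unfamiliar with the Information Noise-Contrastive Estimation (InfoNCE) loss mentioned above, the following is a minimal sketch of its standard form. How the authors attach it to the world model is not described here, so the pairing of `queries` with `keys`, the dot-product similarity, and the `temperature` value are all assumptions for illustration.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def info_nce_loss(queries, keys, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    keys[i] is treated as the positive for queries[i]; every other key in
    the batch acts as a negative. Hypothetical setup, not the paper's code.
    """
    if not queries or len(queries) != len(keys):
        raise ValueError("queries and keys must be non-empty and equal in length")
    total = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, k) / temperature for k in keys]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[i]  # equals -log softmax at the positive index
    return total / len(queries)

# Aligned pairs score a low loss; uninformative keys score log(batch size).
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
uninformative = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
```

In a latent-dynamics setting, the queries might be predicted future latents and the keys the encoded observations at the matching time steps, so that minimizing the loss rewards predictions that identify their own future among the other samples in the batch.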
