April 19, 2024 | Open Access

Model-based Adversarial Imitation Learning with Self-adaptive Error Control

Abstract

Generative Adversarial Imitation Learning (GAIL) can learn policies without prior knowledge of the underlying reward function. However, it often suffers from limited sample efficiency because it relies on reinforcement learning for policy learning, which requires extensive real-time interaction with the environment. To address this challenge, this paper introduces a refined framework named TM-GAIL, which combines transition function model learning with GAIL. The approach uses neural networks to construct a transition function model, which generates virtual samples to complement real data. The discriminator is trained on virtual samples alongside expert demonstration data. The policy is learned from virtual samples, real samples, and the reward derived from the discriminator. Furthermore, a self-adaptive error control module is designed to focus on regions characterized by high returns and to mitigate model errors. Empirical findings demonstrate that TM-GAIL significantly improves sample efficiency in comparison to imitation learning and model-free methods, and that it achieves performance close to that of domain experts across both continuous and discrete tasks.
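The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of complementing real transitions with virtual ones from a learned transition model. Everything in it is an assumption made for illustration: the toy 1-D environment, the least-squares linear model (standing in for the neural network the abstract mentions), the placeholder policy, and the `adaptive_horizon` rule, which is a hypothetical stand-in and not the paper's self-adaptive error control module.

```python
import random

def true_step(s, a):
    # Toy 1-D environment, assumed purely for illustration.
    return 0.9 * s + a

def collect_real(n, rng):
    # Real transitions (s, a, s') gathered by interacting with the environment.
    data = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        a = rng.uniform(-1.0, 1.0)
        data.append((s, a, true_step(s, a)))
    return data

def fit_linear_model(data):
    # Least-squares fit of s' ~ w_s*s + w_a*a via the 2x2 normal equations.
    # A linear model stands in for the neural-network transition model.
    sss = sum(s * s for s, a, _ in data)
    saa = sum(a * a for s, a, _ in data)
    ssa = sum(s * a for s, a, _ in data)
    ssy = sum(s * y for s, a, y in data)
    say = sum(a * y for s, a, y in data)
    det = sss * saa - ssa * ssa
    w_s = (ssy * saa - say * ssa) / det
    w_a = (say * sss - ssy * ssa) / det
    return w_s, w_a

def model_error(model, data):
    # Mean absolute one-step prediction error on held-out real transitions.
    w_s, w_a = model
    return sum(abs(w_s * s + w_a * a - y) for s, a, y in data) / len(data)

def adaptive_horizon(error, max_horizon=5, tolerance=0.05):
    # Hypothetical error-control rule (NOT the paper's module): shorten
    # model rollouts as held-out error grows, and disable them entirely
    # once the error reaches the tolerance.
    if error >= tolerance:
        return 0
    return max(1, round(max_horizon * (1.0 - error / tolerance)))

def virtual_samples(model, starts, policy, horizon):
    # Roll the learned model forward from real start states.
    w_s, w_a = model
    out = []
    for s in starts:
        for _ in range(horizon):
            a = policy(s)
            s_next = w_s * s + w_a * a
            out.append((s, a, s_next))
            s = s_next
    return out

rng = random.Random(0)
real = collect_real(200, rng)
train, held_out = real[:150], real[150:]

model = fit_linear_model(train)
horizon = adaptive_horizon(model_error(model, held_out))

def policy(s):
    # Placeholder policy, assumed for illustration.
    return -0.5 * s

starts = [s for s, _, _ in held_out[:10]]
virtual = virtual_samples(model, starts, policy, horizon)

# Mixed batch that a discriminator or policy update could consume.
batch = train + virtual
```

In a GAIL-style loop, a batch like `batch` would be scored by the discriminator against expert demonstrations, and the resulting reward would drive the policy update. Tying the rollout length to held-out model error is one simple way to limit how far compounding model errors can propagate into the virtual data.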
