Meta reinforcement learning (meta-RL) extracts knowledge from previous tasks and achieves fast adaptation to new tasks. Despite recent progress, efficient exploration in meta-RL remains a key challenge in sparse-reward tasks, as it requires quickly finding informative task-relevant experiences in both meta-training and adaptation. To address this challenge, we explicitly model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning, and introduce a novel empowerment-driven exploration objective, which aims to maximize information gain for task identification. We derive a corresponding intrinsic reward and develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies by sharing the knowledge of task inference. Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on various sparse-reward MuJoCo tasks and more complex sparse-reward Meta-World tasks.
Zhang et al. (Mon,) studied this question.