Visual Reinforcement Learning (RL) is a promising approach to achieving human-like intelligence. However, it currently struggles to learn efficiently in noisy environments. In contrast, humans can quickly identify task-relevant objects in distraction-filled surroundings by applying previously acquired common knowledge. Recently, foundational models in natural language processing and computer vision have achieved remarkable successes, and the common knowledge within these models can significantly benefit downstream task training. Inspired by these achievements, we aim to incorporate common knowledge from foundational models into visual RL. We propose a novel Focus-Then-Decide (FTD) framework, which allows the agent to make decisions based solely on task-relevant objects. To achieve this, we introduce an attention mechanism that selects task-relevant objects from the object set returned by a foundational segmentation model, and we use only those objects for the subsequent training of the decision module. Additionally, we employ two generic self-supervised objectives to facilitate the rapid learning of this attention mechanism. Experimental results on challenging tasks based on DeepMind Control Suite and Franka Emika Robotics demonstrate that our method can quickly and accurately pinpoint objects of interest in noisy environments. Consequently, it achieves a significant performance improvement over current state-of-the-art algorithms.
Project Page: https://www.lamda.nju.edu.cn/chenc/FTD.html
Code: https://github.com/LAMDA-RL/FTD
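As a rough illustration of the focus step described in the abstract, the sketch below takes a set of segment masks (standing in for the output of a segmentation model), scores them with attention logits, and keeps only the highest-weighted segments before anything is passed to a decision module. It is a toy under stated assumptions, not the authors' implementation: the function name `focus`, the hand-written logits, and the hard top-k selection are all hypothetical, and in FTD the attention is learned with self-supervised objectives rather than supplied by hand.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def focus(image, masks, logits, k=1):
    """Keep only the k segments with the highest attention weight.

    image:  (H, W, C) float array, the raw observation
    masks:  (N, H, W) bool array, one mask per segmented object
    logits: length-N attention scores (assumed given here; learned in practice)
    """
    weights = softmax(np.asarray(logits, dtype=float))
    top = np.argsort(weights)[::-1][:k]      # indices of the k largest weights
    keep = np.any(masks[top], axis=0)        # union of the selected masks, (H, W)
    focused = image * keep[..., None]        # zero out every other pixel
    return focused, weights, top

# Toy observation: three non-overlapping segments in a 4x4 image.
image = np.ones((4, 4, 3))
masks = np.zeros((3, 4, 4), dtype=bool)
masks[0, :2, :2] = True    # hypothetical task-relevant object
masks[1, 2:, 2:] = True    # hypothetical distractor
masks[2, :2, 2:] = True    # hypothetical distractor
logits = [2.0, 0.1, -1.0]  # assumed attention scores

focused, weights, top = focus(image, masks, logits, k=1)
```

With these made-up scores, only the pixels under the first mask survive in `focused`; a decision module would then be trained on `focused` instead of the full, distraction-filled image.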
Chao Chen
Jiacheng Xu
Weijian Liao
Nanjing University
Tencent (China)
Chen et al. studied this question.
www.synapsesocial.com/papers/68e72a6ab6db6435876a3fd6 — DOI: https://doi.org/10.1609/aaai.v38i10.29002