Abstract

Recent advances in Large Language Models (LLMs) offer powerful common-sense reasoning and planning capabilities, making them a promising tool for Object Goal Navigation (ObjectNav). However, existing LLM-based approaches face two significant challenges: the high computational cost of LLM inference, which limits real-time decision making, and a domain gap between the LLMs’ general-purpose knowledge and the specific demands of navigation scenarios. To overcome these challenges, we propose KEID, a Knowledge-Enhanced navigation framework with an Intuitive-Deliberate mechanism. Mimicking human cognition, this mechanism uses a lightweight intuition module to invoke the LLM selectively, which reduces computational overhead. KEID also enhances the LLM with two specialized knowledge bases: a Scene Description Tree, which describes the complex spatial and semantic relationships of indoor environments in a hierarchical structure, and a Navigation Example database, which supports adaptation through in-context learning. Evaluations on the HM3D dataset in the Habitat simulator validate the method: KEID achieves a 47.1% success rate and a competitive 18.8% success weighted by path length (SPL), significantly outperforming existing baselines. Our work improves both navigation performance and decision-making efficiency, establishing an effective framework for developing practical, real-time LLM-based robotic agents.
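For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how success rate and success weighted by path length (SPL) are commonly computed in embodied navigation benchmarks. It assumes the standard definition, in which each successful episode contributes the ratio of the shortest-path length to the larger of the agent's path length and the shortest-path length. The function and argument names are illustrative and are not taken from KEID or from Habitat.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the goal."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(successes) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by path length, averaged over all episodes.

    Assumed definition: mean of S_i * l_i / max(p_i, l_i), where S_i is
    the success flag, l_i the shortest-path length from start to goal,
    and p_i the length of the path the agent actually took.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if len(shortest_lengths) != n or len(path_lengths) != n:
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for ok, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if shortest <= 0:
            raise ValueError("shortest-path lengths must be positive")
        if ok:
            # Failed episodes contribute zero; successful ones are
            # discounted when the agent's path exceeds the shortest path.
            total += shortest / max(taken, shortest)
    return total / n


if __name__ == "__main__":
    # Hypothetical episodes, for illustration only (not results from the paper).
    flags = [True, False, True]
    shortest = [5.0, 4.0, 3.0]
    taken = [10.0, 4.0, 3.0]
    print(f"SR  = {success_rate(flags):.3f}")
    print(f"SPL = {spl(flags, shortest, taken):.3f}")
```

Under this definition SPL can never exceed the success rate, which is why an SPL of 18.8% alongside a 47.1% success rate indicates that successful episodes often follow paths longer than the shortest one.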
Yang et al. (Fri,) studied this question.