What question did this study set out to answer?

To categorize and analyze advancements in vision-and-language navigation and their implications for artificial intelligence.

March 23, 2026, Open Access

A comprehensive review of recent advancements in vision-and-language navigation

Key Points

Conducted a comprehensive literature review.
Categorized literature into frameworks, models, auxiliaries, and tasks.
Analyzed trends, strengths, and limitations across different studies.
Consolidated simulation environments and datasets for review.
Identified key challenges in vision-and-language navigation.
Highlighted notable methodologies and innovations in the field.
Provided comparative analysis of various models and frameworks.

Abstract

Vision-and-Language Navigation (VLN) is a rapidly evolving, cross-disciplinary field that has grown remarkably since its recent inception. Leveraging deep learning, it aims to endow embodied agents with auditory and visual perception, thereby enabling them to interpret multi-modal cues. Although the end objective is singular, the ways this problem is approached in the larger context of artificial intelligence differ substantially in scope. Consequently, various contributions, ranging from architectural innovations and novel models to improved training strategies, datasets, and methodologies, have been proposed in the literature. This work therefore aims to provide a comprehensive review of scientific contributions and advancements by categorizing the literature into four primary abstraction layers: frameworks, models, auxiliaries, and tasks. Within each abstraction layer, the literature is further sub-categorized, analyzed for notable trends, strengths, and weaknesses, and cross-compared for respective advantages and limitations. The simulation environments and datasets are consolidated, and insights into each are presented along with a summary of their advantages and disadvantages. Through extensive analysis, crucial challenges that remain in VLN are identified, guiding future efforts toward more intelligent and adaptable navigation agents.

Cite This Study

Khan et al. studied this question.

synapsesocial.com/papers/69c08bcaa48f6b84677f9add https://doi.org/10.1007/s10791-026-09977-z