What question did this study set out to answer?

This research analyzes how UAV Embodied AI differs from other forms of embodied AI, and seeks to clarify the unique challenges and opportunities in this field.

May 13, 2026 | Open Access

Embodied AI in the Sky: A Comparative Review of UAV Embodied AI, from Autonomous Remote Sensing to Task Execution

Key Points

Introduced a comparative framework contrasting UAV-EAI with Indoor-EAI and AD-EAI.
Analyzed core tasks including perception, localization, and exploration.
Categorized modeling methods ranging from physics-centric control to cognition-centric models.
Identified core distinctions in motion space and operational environments for UAVs versus ground agents.
Demonstrated that existing VLM/LLM-based systems face challenges in adapting to the UAV context.
Summarized open challenges and outlined future research opportunities for advancing UAV-EAI.

Abstract

Unmanned Aerial Vehicles (UAVs), particularly rotary-wing platforms such as quadcopters and octocopters, have evolved from controlled remote sensing platforms into autonomous agents capable of active task execution. This evolution from collect-then-analyze workflows to closed-loop perception, reasoning, and action signifies a paradigm shift toward Embodied AI, unlocking opportunities for the low-altitude economy. However, current research on UAV Embodied AI (UAV-EAI) often implicitly frames the field as a direct extension of indoor robotics or autonomous driving, overlooking the fundamental distinctions of aerial agents. To bridge this gap, we introduce a comparative framework contrasting UAV-EAI with Indoor Embodied AI (Indoor-EAI) and Autonomous Driving Embodied AI (AD-EAI). By systematically decomposing the domain into nine key dimensions, we (i) analyze core tasks such as perception, localization, and exploration; (ii) review enabling infrastructure, including simulators and datasets; and (iii) categorize modeling methods ranging from physics-centric control to cognition-centric models. Our analysis demonstrates that the convergence of a 6-DoF motion space, kilometer-scale unstructured environments, and stringent on-device constraints establishes a research regime qualitatively different from that of ground-based agents. These factors significantly impede the migration of existing VLM/LLM-based embodied systems to UAVs. Finally, we summarize open challenges and outline promising directions for the next generation of UAV-EAI.
