In recent years, embodied intelligence has emerged as a critical interdisciplinary domain integrating multimodal perception, large model reasoning, and intelligent decision-making, demonstrating tremendous potential to expand the boundaries of intelligence and empower problem-solving in the real world. However, due to the complexity of embodied manipulations and the diversity of task scenarios, embodied intelligence research faces severe challenges, including difficulties in high-quality data collection, large embodied model construction, and training and inference optimization. These challenges hinder the development of embodied intelligence toward larger scales, stronger generality, and broader applications. This paper first introduces the relationship between general large models and large embodied models, and discusses the key challenges of embodied intelligence in three aspects: “data-model-optimization”. Then, this paper systematically reviews the core technologies of embodied intelligence, covering three main threads (embodied data and simulation, embodied “cerebellum-cerebrum” models, and embodied training and inference), with a focus on analyzing the trends in technological development. Finally, the paper discusses open challenges and future directions, including scalable data collection via integrated virtual-real collection, general embodied reasoning and manipulation, and few-shot adaptation and real-time decision-making, aiming to promote technological breakthroughs and real-world applications of embodied intelligence.
Wang et al. (Sun,) studied this question.