July 15, 2025 | Open Access

Integrating Large Language Models into Robotic Autonomy: A Review of Motion, Voice, and Training Pipelines

Key Points

Large language models significantly improve robotic autonomy across locomotion, navigation, manipulation, and voice-based interaction.
TrustNavGPT reduces the word error rate to 5.7% for voice commands, enhancing navigation under noisy conditions.
The integration of multi-modal data in frameworks like MapGPT supports robust planning and real-time execution.
Best practices identified in this review aim to bridge the gap between simulation training and real-world robotic deployment.

Abstract

This survey provides a comprehensive review of the integration of large language models (LLMs) into autonomous robotic systems, organized around four key pillars: locomotion, navigation, manipulation, and voice-based interaction. We examine how LLMs enhance robotic autonomy by translating high-level natural language commands into low-level control signals, supporting semantic planning and enabling adaptive execution. Systems like SayTap improve gait stability through LLM-generated contact patterns, while TrustNavGPT achieves a 5.7% word error rate (WER) under noisy voice-guided conditions by modeling user uncertainty. Frameworks such as MapGPT, LLM-Planner, and 3D-LOTUS++ integrate multi-modal data—including vision, speech, and proprioception—for robust planning and real-time recovery. We also highlight the use of physics-informed neural networks (PINNs) to model object deformation and support precision in contact-rich manipulation tasks. To bridge the gap between simulation and real-world deployment, we synthesize best practices from benchmark datasets (e.g., RH20T, Open X-Embodiment) and training pipelines designed for one-shot imitation learning and cross-embodiment generalization. Additionally, we analyze deployment trade-offs across cloud, edge, and hybrid architectures, emphasizing latency, scalability, and privacy. The survey concludes with a multi-dimensional taxonomy and cross-domain synthesis, offering design insights and future directions for building intelligent, human-aligned robotic systems powered by LLMs.

Cite This Study

Liu et al. (2025) studied this question.

synapsesocial.com/papers/689a02bce6551bb0af8cc7bd https://doi.org/10.3390/ai6070158
