Coordinating multiple unmanned aerial vehicles (UAVs) traditionally relies on predefined control strategies and complex hand-written programs, which makes adaptation to dynamic environments and changing tasks difficult. This study explores intent-driven control supported by large language models (LLMs) to address these limitations. The objective is to develop a framework that interprets high-level human intent and automatically translates it into executable control instructions for UAV swarms. We present a dual-layer intent-driven cooperative control framework that separates cognitive planning from real-time execution. The framework combines a hierarchical interface, standardized application programming interfaces (APIs), retrieval-augmented generation for incorporating domain knowledge, and multimodal prompt engineering that turns natural-language instructions and sensor data into Python code. The results show that the framework achieves high code-generation accuracy in typical scenarios, improves programming efficiency compared with manual methods, and enables adaptive optimization of cooperative strategies by monitoring emergent behaviors. In summary, this study contributes an intent-driven solution that reduces the programming complexity of cooperative swarm control and lowers the technical barrier to deploying advanced autonomous aerial systems.
Li et al. (Sun,) studied this question.
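
To make the separation between cognitive planning and real-time execution concrete, the following is a minimal Python sketch under stated assumptions, not the framework's actual implementation. All names (`SwarmAPI`, `Step`, `plan_from_intent`, `execute`) are hypothetical, and a keyword-based stub stands in for the LLM and retrieval-augmented generation so that the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import List

# All names below are illustrative assumptions, not taken from the paper.


@dataclass
class Step:
    """One call against the standardized swarm API."""
    command: str
    args: dict


@dataclass
class SwarmAPI:
    """Stand-in for the standardized API; it only records what it is asked to do."""
    n_drones: int
    log: List[str] = field(default_factory=list)

    def takeoff(self, altitude: float) -> None:
        self.log.append(f"takeoff all -> {altitude} m")

    def goto(self, drone_id: int, x: float, y: float) -> None:
        if not 0 <= drone_id < self.n_drones:
            raise ValueError(f"unknown drone id: {drone_id}")
        self.log.append(f"drone {drone_id} -> ({x}, {y})")

    def land(self) -> None:
        self.log.append("land all")


# Commands the execution layer is willing to dispatch.
ALLOWED = {"takeoff", "goto", "land"}


def plan_from_intent(intent: str, n_drones: int) -> List[Step]:
    """Cognitive layer placeholder: a keyword stub where an LLM would be queried."""
    steps = [Step("takeoff", {"altitude": 10.0})]
    if "line" in intent.lower():
        # Space the drones 5 m apart along the x axis.
        steps += [
            Step("goto", {"drone_id": i, "x": 5.0 * i, "y": 0.0})
            for i in range(n_drones)
        ]
    steps.append(Step("land", {}))
    return steps


def execute(plan: List[Step], api: SwarmAPI) -> None:
    """Execution layer: reject anything outside the API, then dispatch in order."""
    for step in plan:
        if step.command not in ALLOWED:
            raise ValueError(f"rejected non-API command: {step.command}")
        getattr(api, step.command)(**step.args)


if __name__ == "__main__":
    api = SwarmAPI(n_drones=3)
    execute(plan_from_intent("Form a line and then land", api.n_drones), api)
    print("\n".join(api.log))
```

The point of the sketch is the boundary: the planner only emits calls named in the standardized API, and the execution layer validates each call before dispatching it, so a faulty plan is rejected rather than run.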