This article presents a novel approach for controlling a fleet of drones that track the location of a flying target using onboard omnidirectional cameras. The drones use Multi-Agent Reinforcement Learning (MARL) to learn decentralized policies that optimize their formation and motion around the target so as to minimize the uncertainty in the triangulated position. We design a reward function that encourages the trackers to minimize the trace of the covariance matrix of the triangulated position, which is derived from an analytical model of uncertainty propagation. We train the policies with Multi-Agent PPO (MAPPO), an extension of Proximal Policy Optimization (PPO) to the multi-agent setting, using this common reward function, which encourages good formation and discourages collisions. We validate our approach in simulation and real-flight experiments, demonstrating its effectiveness and its potential to improve autonomous multi-drone coordination for precise tracking.
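To make the reward idea concrete, here is a minimal sketch of a covariance-trace reward for bearing-only triangulation. It is not the paper's analytical model: it assumes each tracker measures a unit bearing to the target with isotropic angular noise `sigma_rad`, and approximates the covariance of the triangulated position by the inverse of the summed first-order information matrices. The function names, the noise value, and the example formations are all hypothetical.

```python
import numpy as np


def triangulation_covariance(tracker_positions, target_position, sigma_rad=0.01):
    """First-order covariance of a target position triangulated from bearings.

    Assumption (not from the paper): each tracker contributes information
    (I - u u^T) / (sigma_rad * d)^2, where u is the unit bearing from the
    tracker to the target and d is the tracker-target distance.
    """
    trackers = np.asarray(tracker_positions, dtype=float)
    target = np.asarray(target_position, dtype=float)
    info = np.zeros((3, 3))
    for p in trackers:
        ray = target - p
        dist = np.linalg.norm(ray)
        u = ray / dist
        # A bearing constrains the target only perpendicular to the line of sight.
        perp = np.eye(3) - np.outer(u, u)
        info += perp / (sigma_rad * dist) ** 2
    # Singular if all bearings are parallel (target unobservable along the ray).
    return np.linalg.inv(info)


def formation_reward(tracker_positions, target_position, sigma_rad=0.01):
    """Shared reward: negative trace of the triangulation covariance."""
    cov = triangulation_covariance(tracker_positions, target_position, sigma_rad)
    return -np.trace(cov)


if __name__ == "__main__":
    target = np.zeros(3)
    radius = 5.0
    # Hypothetical formations: evenly spread vs. bunched on one side of the target.
    spread = [radius * np.array([np.cos(a), np.sin(a), 0.0])
              for a in np.deg2rad([0.0, 120.0, 240.0])]
    bunched = [radius * np.array([np.cos(a), np.sin(a), 0.0])
               for a in np.deg2rad([0.0, 10.0, 20.0])]
    print("spread reward: ", formation_reward(spread, target))
    print("bunched reward:", formation_reward(bunched, target))
```

Under these assumptions, a formation that surrounds the target scores a higher (less negative) reward than one bunched on a single side, because nearly parallel bearings leave the position poorly constrained along the line of sight. A collision term, which the abstract also mentions, is left out of this sketch.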
Gavin et al. studied this question.