Cooperative pursuit–evasion with heterogeneous agents poses a training challenge that flat multi-agent reinforcement learning methods handle poorly: the pursuer team must coordinate internally while competing against adversarial targets, and the two forms of coupling require different learning signals. We present a potential-game-constrained, role-structured tracking framework: a centralized-training, decentralized-execution algorithm for airship-guided unmanned aerial vehicle teams. It decomposes the multi-agent interaction into an internal potential game among pursuers and an external general-sum game against independently controlled targets. It pairs role-structured critics with multi-head attention over heterogeneous agent tokens and with a two-stage task-assignment solver embedded as critic conditioning. Simulation results in a three-dimensional environment show that the proposed framework maintains high capture success in multi-target scenarios where standard baselines degrade substantially. A Gazebo-based visual simulation with full rigid-body dynamics confirms that the learned policy transfers to a higher-fidelity simulator after continuation training with a cascaded PID inner-loop controller.
Yang et al. (Sun,) studied this question.