Defending against adversarial unmanned aerial vehicle (UAV) swarms is a critical challenge for modern security systems, requiring coordinated strategies under partial observability. Existing multi-agent reinforcement learning (MARL) and spatio-temporal graph convolutional network (ST-GCN) methods typically treat agents as homogeneous nodes. As a result, they fail to model the large-scale, dynamic spatio-temporal dependencies needed for cooperative decision-making, and they offer no semantic interpretation of adversarial tactics. To address this, we propose Spatio-Temporal Attention Graph-Enhanced policy optimization (STAGE). Unlike ST-GCN, which relies on fixed or predefined graph structures, STAGE models the diverse swarm topology through a learnable dynamic adjacency matrix and a multi-hop neighborhood aggregation mechanism that captures dependencies at different ranges. Moreover, to overcome the black-box nature of existing adversarial analysis, we design a dual-path intent recognition framework that concatenates cluster-level features from graph attention networks with computed tactical metric vectors and trains a classifier over four types of swarm tactical intent, enabling defenders to recognize enemy intent explicitly. The policy is optimized end-to-end within the MAPPO framework using a structured multi-objective reward function. Extensive experiments in multi-scale area defense scenarios demonstrate that STAGE outperforms state-of-the-art methods in both task performance and the robustness of tactical comprehension.
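To make the graph components above concrete, the following is a minimal sketch of a dynamic adjacency matrix, multi-hop neighborhood aggregation, and the dual-path feature concatenation. It is not the authors' implementation: the abstract does not give the exact formulation, so the softmax-over-ReLU adjacency, the weighted sum over powers of the adjacency, the mean pooling, and all function names and numeric values here are assumptions chosen for illustration. In a trained model, the node embeddings and hop weights would be learnable parameters rather than fixed lists.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dynamic_adjacency(embeddings):
    """Row-normalized adjacency built from pairwise embedding similarity.

    Assumed form: softmax(relu(E E^T)) applied row by row. The embeddings
    stand in for parameters that would be learned during training.
    """
    n = len(embeddings)
    scores = [[sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
               for j in range(n)] for i in range(n)]
    return [softmax([max(0.0, s) for s in row]) for row in scores]


def multi_hop_aggregate(adj, features, hops, hop_weights):
    """Weighted sum of A^k X for k = 0..hops (hop 0 is the node itself)."""
    out = [[hop_weights[0] * x for x in row] for row in features]
    propagated = features
    for k in range(1, hops + 1):
        propagated = matmul(adj, propagated)  # one more hop of neighbors
        for i, row in enumerate(propagated):
            for j, value in enumerate(row):
                out[i][j] += hop_weights[k] * value
    return out


def dual_path_features(cluster_feature, tactical_metrics):
    """Concatenate the graph-derived path with the metric path."""
    return list(cluster_feature) + list(tactical_metrics)


# Hypothetical toy inputs: three agents, two-dimensional embeddings/features.
embeddings = [[0.5, 0.1], [0.4, 0.2], [-0.3, 0.9]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

adj = dynamic_adjacency(embeddings)
mixed = multi_hop_aggregate(adj, features, hops=2, hop_weights=[0.5, 0.3, 0.2])

# Mean-pool node features into one cluster-level vector (an assumption),
# then append a made-up tactical metric vector before classification.
cluster_feature = [sum(col) / len(mixed) for col in zip(*mixed)]
classifier_input = dual_path_features(cluster_feature, [0.7, 0.2, 0.1])
```

The concatenated `classifier_input` is what a four-class intent classifier would consume in this sketch. Because each row of the adjacency sums to one, each hop is a weighted average over neighbors, so the hop weights alone control how much near and distant context enters each agent's representation.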
Su et al. (Mon,) studied this question.