Large-scale multi-agent systems face two core challenges: inefficient policy learning and an explosion in state dimensionality. Existing methods often rely on manually designed task sequences to guide agents’ learning in stages, but these sequences do not adapt to the agents’ learning abilities, so there is no guarantee that task difficulty is appropriate at each stage. Moreover, current network structures have limited representational capacity, which makes it hard to handle high-dimensional state information and complex interaction relationships efficiently. To address these issues, we propose a Progressive Multi-Agent Reinforcement Learning (PMARL) framework. PMARL introduces a task adapter that selects task difficulty adaptively based on agents’ learning abilities, removing the reliance on manual experience. In addition, we design a Dynamic Dimension Adaptive Network (DDAN) that combines hypernetwork and self-attention mechanisms to extract features adaptively from high-dimensional states and to represent agent interaction relationships efficiently. Experimental results show that PMARL is more efficient and more adaptable than existing methods on large-scale multi-agent tasks.
Fang et al. studied this question.
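The abstract does not specify how the task adapter chooses difficulty, so the following is only a minimal sketch of one common way to adapt a curriculum to learner performance, not the PMARL method itself. It assumes difficulty is a discrete level and that each episode yields a success or failure signal; the class name `TaskAdapter` and the parameters `window`, `promote_at`, and `demote_at` are hypothetical and chosen for illustration.

```python
from collections import deque


class TaskAdapter:
    """Hypothetical success-rate curriculum; NOT the PMARL implementation."""

    def __init__(self, num_levels, window=20, promote_at=0.8, demote_at=0.3):
        if num_levels < 1:
            raise ValueError("num_levels must be at least 1")
        self.num_levels = num_levels
        self.promote_at = promote_at  # assumed threshold for harder tasks
        self.demote_at = demote_at    # assumed threshold for easier tasks
        self.level = 0                # start at the easiest level
        self.results = deque(maxlen=window)  # recent episode outcomes

    def record(self, success):
        """Store one episode outcome and return the level for the next one."""
        self.results.append(1.0 if success else 0.0)
        # Wait for a full window before judging the agents' ability.
        if len(self.results) < self.results.maxlen:
            return self.level
        rate = sum(self.results) / len(self.results)
        if rate >= self.promote_at and self.level < self.num_levels - 1:
            self.level += 1
            self.results.clear()  # re-evaluate from scratch at the new level
        elif rate <= self.demote_at and self.level > 0:
            self.level -= 1
            self.results.clear()
        return self.level
```

In a training loop, one would call `adapter.record(success)` after each episode and use the returned level to configure the next task. Clearing the window after each change keeps outcomes from the previous level from influencing the next decision.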