Multi-robot systems (MRSs) offer distinct advantages in large-scale exploration but require tight coupling between decentralized decision-making and collaborative estimation. This survey reviews learning-based multi-robot Active Collaborative Simultaneous Localization and Mapping (AC-SLAM), modeling it as a coupled system comprising a decision layer formulated as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) and a distributed factor-graph estimation layer. These components are synthesized into a conceptual framework, which is then used to systematically critique recent methods for cooperative perception, mapping, and policy learning. The analysis concludes that Hierarchical Reinforcement Learning (HRL) and graph-based spatial abstraction currently offer better scalability and robustness than monolithic end-to-end approaches. The survey also analyzes Sim-to-Real transfer strategies, ranging from domain randomization to emerging Real-to-Sim techniques based on Neural Radiance Fields (NeRF) and 3D Gaussian Splatting. Finally, future directions are outlined, moving from geometric mapping toward active semantic understanding driven by large language models (LLMs) and toward dynamic digital twins that help bridge the reality gap.
Lv et al. studied this question.