Unmanned aerial vehicles (UAVs) are widely deployed in emergency rescue missions because of their cost-effectiveness, high maneuverability, and flexible flight capabilities. However, densely packed high-rise buildings, no-fly zones, and other features of complex urban environments significantly complicate rescue operations. To protect equipment and ensure reliable data transmission, a UAV swarm needs an efficient coverage path planning strategy that also prioritizes safe landing. This paper addresses multi-UAV area coverage path planning in complex post-disaster urban scenarios, formulated as a coverage-rate maximization problem subject to safe landing rate requirements. Because of uncertainties in the environment and UAV dynamics, the flight status of the UAVs is modeled as a partially observable Markov decision process. Accordingly, a collaborative strategy based on deep reinforcement learning is developed, in which collaborative rewards are determined through a coverage matrix. The approach prioritizes identifying the shortest coverage path while maximizing rewards, enabling real-time collaborative decision-making in UAV swarms. Comprehensive simulations show that the proposed method is effective for controlling UAV swarms and that the algorithm generalizes well to dynamic environments.
Gao et al. studied this question.
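To make the idea of a coverage-matrix-driven collaborative reward concrete, the following is a minimal sketch, not the reward actually used in the paper, whose exact form the abstract does not specify. It assumes a rectangular grid, a square sensing footprint of fixed radius, a single reward shared by all UAVs, and a small per-step cost that favors shorter coverage paths. The names `footprint`, `collaborative_reward`, `coverage_rate`, and the parameters `radius` and `step_cost` are hypothetical and introduced only for illustration.

```python
from typing import List, Tuple

Cell = Tuple[int, int]


def footprint(pos: Cell, radius: int, rows: int, cols: int) -> List[Cell]:
    """Cells inside a square sensing footprint around pos, clipped to the grid.

    The square footprint is an assumption made for this sketch.
    """
    r0, c0 = pos
    return [
        (r, c)
        for r in range(max(r0 - radius, 0), min(r0 + radius, rows - 1) + 1)
        for c in range(max(c0 - radius, 0), min(c0 + radius, cols - 1) + 1)
    ]


def collaborative_reward(
    coverage: List[List[int]],
    positions: List[Cell],
    radius: int = 1,
    step_cost: float = 0.01,
) -> float:
    """Mark newly sensed cells in the coverage matrix and return a shared reward.

    The reward is the fraction of the grid newly covered in this step minus a
    fixed step cost (hypothetical shaping term). Cells seen by several UAVs,
    or already covered earlier, are counted only once.
    """
    rows, cols = len(coverage), len(coverage[0])
    newly_covered = 0
    for pos in positions:
        for r, c in footprint(pos, radius, rows, cols):
            if coverage[r][c] == 0:
                coverage[r][c] = 1
                newly_covered += 1
    return newly_covered / (rows * cols) - step_cost


def coverage_rate(coverage: List[List[int]]) -> float:
    """Fraction of grid cells covered so far."""
    total = sum(len(row) for row in coverage)
    return sum(sum(row) for row in coverage) / total


if __name__ == "__main__":
    grid = [[0] * 4 for _ in range(4)]
    print(collaborative_reward(grid, [(0, 0), (0, 1)]))
    print(coverage_rate(grid))
```

Because overlapping footprints earn nothing extra, a shared reward of this shape gives each UAV an incentive to move toward uncovered cells rather than duplicate a teammate's coverage. Obstacles, no-fly zones, and the safe landing constraint are deliberately left out of this sketch.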