What question did this study set out to answer?

The study asks how reinforcement learning can improve cooperative trajectory planning for multiple UAVs that search for and track a swarm of USVs, particularly when the USVs outnumber the UAVs.

February 12, 2026 | Open Access

Multi-UAVs Searching and Tracking for USV Swarm: A Center-Sub-Critics Reinforcement Learning Approach

Key Points

Aims to improve cooperative trajectory planning for multiple UAVs searching for and tracking USVs, using reinforcement learning.
Developed a confidence map for target probabilities using Bayesian updates.
Formulated the problem as a partially observable Markov decision process.
Proposed a center-sub-critics deep deterministic policy gradient algorithm for policy improvement.
Designed a segmented reward function to encourage UAVs to revisit detected targets.
Simulation results show improved searching and tracking efficiency compared to baseline algorithms.
Demonstrated fairness in resource allocation among UAVs during operations.
Showed scalability of the proposed method to larger numbers of USVs.
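
The segmented reward mentioned in the key points can be pictured as a piecewise function of what a UAV observes at each step. The summary does not give the paper's actual segments or values, so the sketch below is only an illustration: the function name `segmented_reward`, the `revisit_threshold` parameter, and every numeric reward are assumptions chosen to show the idea of rewarding timely revisits rather than continuous hovering.

```python
from typing import Optional


def segmented_reward(
    new_detection: bool,
    steps_since_last_visit: Optional[int],
    revisit_threshold: int = 10,  # hypothetical: minimum gap before a revisit pays off
) -> float:
    """Illustrative piecewise reward; all segments and values are assumptions."""
    if new_detection:
        # Segment 1: a previously unseen target enters the field of view.
        return 1.0
    if steps_since_last_visit is None:
        # Segment 2: no known target in view, small step cost to keep searching.
        return -0.01
    if steps_since_last_visit >= revisit_threshold:
        # Segment 3: a known target is revisited after a long enough gap.
        return 0.5
    # Segment 4: the target was seen too recently, so lingering earns nothing.
    return 0.0
```

With more targets than UAVs, a shape like this would push an agent to leave a freshly observed target and return to it later, which is one plausible way to read "revisit detected targets".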

Abstract

This work proposes a cooperative trajectory planning scheme for multiple unmanned aerial vehicles (UAVs), built on multi-agent reinforcement learning with hybrid critics, that improves searching and tracking efficiency and fairness when the vehicles in a dynamic unmanned surface vehicle (USV) swarm outnumber the UAVs. A confidence map of the targets’ existence probability with spatio-temporal decay is first established through a local information fusion mechanism based on Bayesian update theory. This leads to a reformulation of the problem as a communication-enhanced partially observable Markov decision process. To suppress policy variance and credibility imbalance among the UAVs, a center-sub-critics deep deterministic policy gradient algorithm is then proposed, combining multiple centralized critics with decentralized critics. In addition, a segmented reward function is designed to incentivize each UAV to revisit detected targets. Finally, simulation comparisons against diverse baseline algorithms demonstrate the efficacy and scalability of the proposed scheme.
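
To make the confidence-map idea concrete, here is a minimal sketch of a per-cell Bayesian update followed by a decay step. It uses the textbook binary-sensor form of Bayes' rule; the abstract does not state the paper's sensor model or decay law, so the detection probability `p_detect`, false-alarm probability `p_false_alarm`, exponential `rate`, and both function names are assumptions. Only decay over time is shown, not the spatial part.

```python
import math


def bayes_update(prior: float, detected: bool,
                 p_detect: float = 0.9, p_false_alarm: float = 0.1) -> float:
    """Posterior probability that a cell holds a target after one sensor reading.

    p_detect and p_false_alarm are assumed sensor parameters, not values
    taken from the paper.
    """
    if detected:
        like_target, like_empty = p_detect, p_false_alarm
    else:
        like_target, like_empty = 1.0 - p_detect, 1.0 - p_false_alarm
    numerator = like_target * prior
    return numerator / (numerator + like_empty * (1.0 - prior))


def decay_toward_uninformed(prob: float, elapsed: float,
                            rate: float = 0.1, uninformed: float = 0.5) -> float:
    """Relax a stale cell toward an uninformed value (assumed exponential law)."""
    return uninformed + (prob - uninformed) * math.exp(-rate * elapsed)


# One cell: start uninformed, observe a detection, then leave it unobserved.
p = bayes_update(0.5, detected=True)        # confidence rises after a detection
p = decay_toward_uninformed(p, elapsed=5)   # confidence fades while unobserved
```

The decay step is what makes revisiting worthwhile for moving targets: a cell's confidence drifts back toward uncertainty the longer no UAV has observed it.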

Authors

Ye Hou

Bo Li

Xueru Miao

Journals

Drones

Institutions

Shanghai Maritime University
