What question did this study set out to answer?

To develop a dynamic coordination control method for multi-USVs to efficiently capture targets through path planning and tracking.

March 17, 2026 | Open Access

Multi-USV coordination control method of path planning and tracking based on MADDPG for target capture

Key Points

Constructed a bounded water environment model with obstacles.
Enhanced MADDPG with prioritized experience replay (PER) for optimal path generation.
Fed the real positions of the USVs to the MADDPG network as state variables.
Designed a composite reward function integrating multiple constraints and performance metrics.
Implemented linear ADRC for angle and speed tracking under disturbances.
Achieved effective target capture through enhanced path planning and tracking methods.
Demonstrated successful simulations of static and dynamic target capture with various obstacles.
Validated the robustness of the control method in adverse environmental conditions.
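
The prioritized experience replay (PER) named in the key points can be illustrated with a minimal sketch of the common proportional variant. The paper's summary does not say which PER variant or hyperparameters were used, so the class name, `alpha`, `beta`, and `eps` below are assumptions chosen for illustration only.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative, not the authors' code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha  # assumed value; 0 would give uniform sampling
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so that each
        # one has a good chance of being replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform
        # sampling; they are normalized so the largest weight is 1.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # After a critic update, the new priority is the absolute TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a MADDPG training loop, each sampled batch would be weighted by `weights` in the critic loss, and `update_priorities` would be called with the resulting TD errors.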

Abstract

This paper proposes a coordination control method for path planning and tracking that lets multiple unmanned surface vessels (multi-USV) capture a target by dynamically integrating multi-agent reinforcement learning (MARL) with active disturbance rejection control (ADRC). A bounded water environment model with various obstacles is constructed. To generate the optimal capture path online and accelerate convergence, the real-time multi-agent deep deterministic policy gradient (MADDPG) algorithm is enhanced with prioritized experience replay (PER). So that the agents interact with the environment, the real positions of the USVs are fed to the MADDPG network as state variables. The action space consists of each USV's yaw angle and surge speed, which serve as the reference path for tracking control. To guard against target escape, the USVs must be evenly distributed on the target-centered capture loop while staying outside the target's detection range. For fast and safe capture, a composite reward function is designed that combines a capture reward, an obstacle and collision avoidance reward, a boundary collision restriction reward, a capture inner boundary constraint reward, an angle constraint reward, and a motion constraint reward. To account for actual tracking performance, the errors between the reference commands and the real USV states are also included in the reward function. To follow the reference commands from the enhanced MADDPG in the presence of wind and wave disturbances, angle and speed tracking controllers are developed using linear ADRC (LADRC). Finally, the effectiveness of the proposed method is verified by simulations of capturing static and dynamic targets among various types of obstacles.
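
As a rough illustration of the LADRC tracking layer described in the abstract, the sketch below implements a textbook first-order LADRC for surge-speed tracking only. The abstract gives no vessel model or controller gains, so everything numeric here is an assumption: the plant `v' = -0.5 v + u + d`, the gain `b0`, the bandwidths `omega_c` and `omega_o`, and the wind and wave disturbance `d` are hypothetical. The yaw-angle loop, which would typically need a higher-order observer, is not shown.

```python
import math


class FirstOrderLADRC:
    """Textbook first-order LADRC with a linear extended state observer."""

    def __init__(self, b0, omega_c, omega_o, dt):
        self.b0 = b0                   # assumed input gain of the plant
        self.kp = omega_c              # controller bandwidth sets the P gain
        self.beta1 = 2.0 * omega_o     # observer gains place both observer
        self.beta2 = omega_o ** 2      # poles at -omega_o
        self.dt = dt
        self.z1 = 0.0                  # estimate of the measured speed
        self.z2 = 0.0                  # estimate of the total disturbance
        self.u = 0.0                   # last control input applied

    def step(self, ref, y):
        # Observer update (forward Euler) using the previous control input.
        e = y - self.z1
        dz1 = self.z2 + self.b0 * self.u + self.beta1 * e
        dz2 = self.beta2 * e
        self.z1 += self.dt * dz1
        self.z2 += self.dt * dz2
        # Control law: proportional action minus the estimated disturbance.
        self.u = (self.kp * (ref - self.z1) - self.z2) / self.b0
        return self.u


def simulate(t_end=10.0, dt=0.01, ref=1.5):
    """Track a constant surge-speed reference on a hypothetical plant."""
    ctrl = FirstOrderLADRC(b0=1.0, omega_c=4.0, omega_o=20.0, dt=dt)
    v = 0.0
    for k in range(int(t_end / dt)):
        t = k * dt
        u = ctrl.step(ref, v)
        # Hypothetical disturbance: constant wind plus slow sinusoidal wave.
        d = 0.3 + 0.2 * math.sin(0.5 * t)
        v += dt * (-0.5 * v + 1.0 * u + d)  # assumed first-order dynamics
    return v
```

In the scheme the abstract outlines, `ref` would be the surge-speed command produced by the enhanced MADDPG policy at each step, and the resulting tracking error would feed back into the reward function.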

Cite This Study

Zhu et al. (Sun,) studied this question.

synapsesocial.com/papers/69b8f0fddeb47d591b8c5bfa https://doi.org/10.1177/00202940251379174