What question did this study set out to answer?

This research aims to improve task allocation for a single unmanned surface vehicle (USV) collecting floating debris, using a novel cross-attention PPO algorithm (CA-PPO).

April 25, 2026 · Open Access

CA-PPO: a cross-attention PPO-based task allocation algorithm for single-USV debris collection

Key Points

Developed a cleaning task environment incorporating flow prediction for floating debris.
Implemented a cross-attention module to enhance the policy network's targeting capabilities.
Compared the CA-PPO algorithm against traditional and other reinforcement learning methods.
CA-PPO achieved a superior debris collection rate compared to traditional methods.
Outflow rate was reduced relative to the compared methods, improving overall collection efficiency.
Energy efficiency and movement distance efficiency were improved over other algorithms.

Abstract

With the increasing severity of water pollution, intelligent cleaning technologies based on unmanned systems have attracted widespread attention. Floating debris on water surfaces has dynamic characteristics, such as random movement and easy outflow, and existing reinforcement learning methods face several challenges when applied to practical scheduling tasks, including inadequate prediction of future states, weak prioritization of critical targets, and insufficient consideration of resource constraints. This paper proposes cross-attention-proximal policy optimization (CA-PPO), a task allocation algorithm for a single unmanned surface vehicle (USV), based on the proximal policy optimization (PPO) algorithm. The proposed approach constructed a cleaning task environment that incorporated a flow prediction mechanism for floating debris and accounted for the battery and load limitations of the USV. A cross-attention module enhanced the policy network’s perception of key debris targets. The experimental results demonstrated that the proposed CA-PPO method outperformed traditional heuristic approaches and other reinforcement learning algorithms in terms of debris collection rate, outflow rate, energy efficiency, and movement distance efficiency.
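
To make the cross-attention idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes single-head scaled dot-product attention in which the USV state forms the query and each debris item's feature vector forms a key and a value. The function names, the feature layout, and the projection matrices `w_q`, `w_k`, and `w_v` are hypothetical, and the paper's actual architecture may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cross_attention(usv_state, debris_feats, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (illustrative only).

    Assumption: the query comes from the USV state, while keys and values
    come from the per-debris feature vectors. Returns the attention weight
    for each debris item and the weighted context vector.
    """
    query = matvec(w_q, usv_state)
    keys = [matvec(w_k, d) for d in debris_feats]
    values = [matvec(w_v, d) for d in debris_feats]

    # Scaled dot product between the query and every key.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)

    # Context vector: attention-weighted sum of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, context


# Hypothetical usage: 2-D features and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
usv = [1.0, 0.0]
debris = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = cross_attention(usv, debris, identity, identity, identity)
```

In this toy setup the debris item whose features align most closely with the USV state receives the largest weight. In a policy network, the context vector would typically be passed to the actor and critic heads, though how CA-PPO wires this is not stated in the abstract.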


Cite This Study

Li et al. studied this question.

synapsesocial.com/papers/69ec5bd288ba6daa22dad2b5 https://doi.org/10.1007/s44295-026-00102-w
