Underwater vehicles performing sampling tasks often encounter significant buoyancy variations caused by payload adjustments and environmental changes, which degrade controller stability and accuracy. To address this issue, this paper proposes a hybrid control framework that integrates Proximal Policy Optimization (PPO) with adaptive PID tuning. The framework uses PPO to adjust the PID parameters online while incorporating output saturation, stepwise quantization, and dead-zone filtering to ensure control safety and actuator longevity. A dual-error state representation, combining the instantaneous error and its derivative, is introduced together with actuator command buffering to compensate for system lag and inertia. Comparative simulations and experimental tests show that the proposed method achieves faster convergence, lower steady-state error, and smoother control signals than both conventional PID and pure PPO-based control. Pool tests and field trials validate the framework and confirm its robustness under realistic hydrodynamic disturbances. This work provides a practical and safe solution for adaptive depth control of sampling-capable AUVs operating in dynamic underwater environments.
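As a rough illustration of the control path summarized above, the following minimal Python sketch shows a PID loop whose gains can be overwritten at run time by an external policy (a PPO agent in the paper), followed by output saturation, stepwise quantization, and a dead-zone filter. It is not the authors' implementation. The class name `SafePID`, the helper `dual_error_state`, the default limits, and the reading of the dead zone as "hold the previous command when the change is small" are all assumptions made for illustration; command buffering and the PPO training loop are omitted.

```python
from dataclasses import dataclass


def dual_error_state(error: float, prev_error: float, dt: float) -> tuple:
    """Hypothetical dual-error observation: (instantaneous error, error derivative)."""
    return (error, (error - prev_error) / dt)


@dataclass
class SafePID:
    """PID controller with externally tunable gains and a safety-shaped output.

    All default values below are illustrative assumptions, not values from the paper.
    """
    kp: float
    ki: float
    kd: float
    u_max: float = 1.0       # assumed saturation limit (normalized command)
    step: float = 0.05       # assumed quantization step
    dead_zone: float = 0.02  # assumed minimum command change worth sending
    integral: float = 0.0
    prev_error: float = 0.0
    last_cmd: float = 0.0

    def set_gains(self, kp: float, ki: float, kd: float) -> None:
        """Entry point for an online tuner (e.g. a PPO policy action)."""
        self.kp, self.ki, self.kd = kp, ki, kd

    def update(self, error: float, dt: float) -> float:
        # Standard discrete PID terms.
        derivative = (error - self.prev_error) / dt
        self.integral += error * dt
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative

        # Output saturation: clamp to the actuator's allowed range.
        u = max(-self.u_max, min(self.u_max, u))

        # Stepwise quantization: snap to the nearest multiple of `step`.
        u = round(u / self.step) * self.step

        # Dead-zone filtering (assumed form): ignore small command changes
        # so the actuator is not driven by negligible corrections.
        if abs(u - self.last_cmd) < self.dead_zone:
            u = self.last_cmd

        self.last_cmd = u
        return u


if __name__ == "__main__":
    pid = SafePID(kp=2.0, ki=0.0, kd=0.0)
    print(pid.update(error=1.0, dt=0.1))  # raw output 2.0 is clamped to u_max
    pid.set_gains(0.5, 0.1, 0.0)          # gains a policy might propose
    print(pid.update(error=0.4, dt=0.1))
```

In this sketch the policy only proposes gains through `set_gains`, while the saturation, quantization, and dead-zone stages always sit between the PID output and the actuator, so a poor policy action cannot push the command outside the configured limits.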
Wang et al. (Sat,) studied this question.