June 4, 2026 | Open Access

A Residual PPO Method for Shipboard Helicopter Landing Control

Key Points

The research aims to develop a robust control algorithm for helicopter landings on ships, addressing challenges from model uncertainty and environmental disturbances.
Developed a hybrid control algorithm integrating a model-based controller and residual reinforcement learning.
Utilized a split-channel incremental nonlinear dynamic inversion outer loop and a reduced-order dynamic inversion inner loop for the baseline control pathway.
Performed simulations comparing the proposed Residual PPO against baseline controllers and Pure PPO.
Residual PPO improved hover robustness and landing performance compared to the baseline controller and Pure PPO.
With approximately 20–30% residual authority, Residual PPO achieved a 90.0% Desired landing rate in both tested descent-and-landing scenes.

Abstract

Shipboard helicopter landing in the near-deck region requires stable attitude regulation and high-precision deck-relative motion control under substantial model uncertainty and environmental disturbances, conditions under which conventional model-based controllers may lose performance or become overly conservative. This paper proposes a task-oriented, learning-enhanced control algorithm for ship-relative near-deck station keeping and landing by integrating a model-based baseline controller with residual reinforcement learning in a deck-relative closed-loop framework. The algorithmic contribution is the deck-relative baseline–residual control architecture: a split-channel incremental nonlinear dynamic inversion (INDI) outer loop and a reduced-order dynamic inversion (DI) inner loop provide the nominal baseline pathway, while a bounded residual Proximal Policy Optimization (PPO) policy supplies compensation in the same physical outer-loop command channels to suppress unmodeled nonlinearities and time-varying disturbances. Simulation results show that Residual PPO improves hover robustness and landing performance relative to the baseline controller and Pure PPO. With approximately 20–30% residual authority, it achieved 90.0% Desired landing rates in both tested descent-and-landing scenes.
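The baseline–residual composition described above can be illustrated with a minimal sketch. The abstract does not state how the residual is bounded or what the channel limits are, so everything here is an assumption for illustration: the function name `compose_command`, the `tanh` squashing, the per-channel limit `u_max`, and the reading of "residual authority" as a fraction of that limit (0.25 sits inside the reported 20–30% range).

```python
import numpy as np


def compose_command(u_base, residual_raw, u_max, authority=0.25):
    """Add a bounded residual correction to a baseline outer-loop command.

    Hypothetical sketch, not the paper's implementation.
    u_base       : baseline controller command per channel
    residual_raw : unbounded policy output per channel
    u_max        : assumed symmetric command limit per channel
    authority    : assumed fraction of u_max the residual may use
    """
    u_base = np.asarray(u_base, dtype=float)
    residual_raw = np.asarray(residual_raw, dtype=float)
    u_max = np.asarray(u_max, dtype=float)

    # tanh maps the raw policy output into [-1, 1], so the residual
    # magnitude never exceeds authority * u_max in any channel.
    residual = authority * u_max * np.tanh(residual_raw)

    # The final clip keeps the combined command inside the channel limits.
    return np.clip(u_base + residual, -u_max, u_max)


# Example: three command channels with unit limits.
u = compose_command(
    u_base=[0.5, -0.2, 0.0],
    residual_raw=[100.0, -100.0, 0.0],
    u_max=[1.0, 1.0, 1.0],
)
```

Under these assumptions the learned policy can shift each baseline command by at most a quarter of the channel limit, so the model-based controller still dominates the command even when the policy output saturates.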
