September 23, 2025 · Open Access

Fast trajectory planner with a reinforcement learning-based controller for robotic manipulators

Abstract

Generating obstacle-free trajectories for robotic manipulators in unstructured and cluttered environments remains a significant challenge. Existing motion planning methods often require additional computational effort to generate the final trajectory by solving kinematic or dynamic equations. This paper highlights the potential of model-free reinforcement learning methods, compared with model-based approaches, for obstacle-free trajectory planning in joint space. We propose a fast trajectory planning system for manipulators that combines vision-based path planning in task space with reinforcement learning-based obstacle avoidance in joint space. The framework has two key components. The first is a vision-based trajectory planner in task space that uses the large-scale fast segment anything (FSA) model together with basis spline (B-spline)-optimized kinodynamic path searching. The second enhances the proximal policy optimization (PPO) algorithm with action ensembles (AE) and policy feedback (PF), which improve precision and stability in goal-reaching and obstacle avoidance within joint space. These PPO enhancements increase the algorithm’s adaptability across diverse robotic tasks, so that the manipulator consistently executes the commands from the first component, while also improving obstacle avoidance efficiency and reaching accuracy. Experimental results demonstrate the effectiveness of the PPO enhancements, as well as of simulation-to-simulation (Sim-to-Sim) and simulation-to-reality (Sim-to-Real) transfer, in improving model robustness and planner efficiency in complex scenarios. These enhancements allow the robot to perform obstacle avoidance and real-time trajectory planning in obstructed environments.
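
For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that the algorithm maximizes. It shows only the well-known baseline; the action ensembles (AE) and policy feedback (PF) extensions described in the abstract are not reproduced here, since their details are not given on this page. The function name, argument names, and the default clipping value of 0.2 are illustrative assumptions.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data: with a positive advantage, the objective stops growing once the ratio exceeds 1 + clip_eps.
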
https://sites.google.com/view/ftp4rm/home

• We develop a novel systematic approach for obstacle-free trajectory planning in the Cartesian space of the end-effector. This approach integrates perception, kinodynamic path searching, and B-spline-based trajectory optimization, ensuring safety and dynamic feasibility for manipulators.
• We introduce a novel algorithm based on model-free reinforcement learning (RL) that effectively addresses reaching tasks with obstacle avoidance for a 6-DoF manipulator in complex environments.
• The proposed method is adaptable to various manipulators and shows enhanced performance compared to existing RL algorithms.
• We conducted extensive assessments to validate the proposed planner’s efficiency and robustness.
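
The B-spline trajectory representation mentioned above can be illustrated with a minimal sketch that evaluates a uniform cubic B-spline from a list of control points. This is a generic textbook construction, not the paper's planner: the kinodynamic search and the optimization of the control points are not shown, and the function names and the choice of a uniform cubic basis are assumptions made for illustration.

```python
def cubic_bspline_point(p0, p1, p2, p3, t):
    """Evaluate one uniform cubic B-spline segment at t in [0, 1].

    p0..p3 are consecutive control points (tuples of equal length).
    """
    # Uniform cubic B-spline basis functions; they sum to 1 for any t.
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    return tuple(
        b0 * a + b1 * b + b2 * c + b3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def sample_bspline(control_points, samples_per_segment=10):
    """Sample a smooth path from control points (hypothetical helper)."""
    if len(control_points) < 4:
        raise ValueError("a cubic B-spline needs at least 4 control points")
    path = []
    # Each window of 4 consecutive control points defines one segment.
    for i in range(len(control_points) - 3):
        window = control_points[i:i + 4]
        for k in range(samples_per_segment):
            path.append(cubic_bspline_point(*window, k / samples_per_segment))
    # Close the last segment at t = 1.
    path.append(cubic_bspline_point(*control_points[-4:], 1.0))
    return path
```

Because each point on the curve is a convex combination of four neighboring control points, the path stays inside their convex hull, which is one reason B-splines are a common choice when a trajectory has to be kept inside known free space.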
