September 10, 2025 · Open Access

Time-optimal online trajectory planning for excavator robotics based on reinforcement learning

Key Points

The proposed algorithm produces a more efficient task trajectory, completing a given task 1.807 to 5.011 s faster than the DDPG, TRPO, and traditional TD3 algorithms.
Training time is reduced by 24.81%, 40.29%, and 34.51% compared with DDPG, traditional TD3, and TRPO, respectively.
A simulation environment generates the training data; the learning algorithm interacts with it through state observations of the boom, arm, and bucket joint angles.
The method limits large impacts on each joint, keeping the excavator robotic system stable during operation.

Abstract

To achieve efficient and stable autonomous operation during plane-trimming operations of an excavator robot, this study proposes an online trajectory planning method based on deep reinforcement learning (RL) using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. A simulation environment is constructed to generate training data: the joint angles of the boom, arm, and bucket of the excavator's working device serve as the state observation variables, and the angle changes of each joint constitute the actions. These state observations mediate the interaction between the simulation environment and the learning algorithm, and the policy network is trained using a reward function. Under identical experimental conditions, the proposed algorithm requires less training time than other RL algorithms designed for continuous action spaces: its training time is 24.81%, 40.29%, and 34.51% shorter than that of the DDPG, traditional TD3, and TRPO algorithms, respectively. In addition, the time required to complete a given task is reduced by 1.807, 3.703, and 5.011 s compared with the DDPG, TRPO, and traditional TD3 algorithms, respectively. These results indicate that the proposed optimization algorithm converges faster and is more efficient than the DDPG, traditional TD3, and TRPO algorithms, ultimately generating an efficient task trajectory. Moreover, the method limits large impacts on each joint, so the excavator robotic system operates with high efficiency and stability.
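
The state and action design described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the joint limits, step size, tolerance, reward values, and all function names below are hypothetical, chosen only to make the example runnable. The `td3_target` function shows the standard TD3 target (target policy smoothing plus the minimum of twin critics), not the modified variant proposed in the paper.

```python
import random

# Hypothetical joint limits in radians for (boom, arm, bucket); not from the paper.
JOINT_LIMITS = [(-0.5, 1.0), (-2.5, -0.5), (-2.5, 0.5)]
MAX_DELTA = 0.05  # assumed largest angle change per step, in radians


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def step(state, action, target, tol=0.02):
    """Apply per-joint angle changes; return (next_state, reward, done).

    state  : current joint angles [boom, arm, bucket]
    action : requested angle change for each joint
    target : goal joint angles (a stand-in for the trimming task)
    """
    next_state = [
        clip(q + clip(dq, -MAX_DELTA, MAX_DELTA), lo, hi)
        for q, dq, (lo, hi) in zip(state, action, JOINT_LIMITS)
    ]
    done = all(abs(q - t) <= tol for q, t in zip(next_state, target))
    # Assumed time-optimal shaping: constant cost per step, bonus at the goal.
    reward = 10.0 if done else -1.0
    return next_state, reward, done


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.01, noise_clip=0.025, rng=random):
    """Standard TD3 target value for one transition.

    target_actor(state) returns a list of actions; target_q1 and target_q2
    take (state, action) and return a scalar. All three are supplied by the
    caller, for example as target networks.
    """
    smoothed = []
    for a in target_actor(next_state):
        # Target policy smoothing: clipped Gaussian noise on the target action.
        noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
        smoothed.append(clip(a + noise, -MAX_DELTA, MAX_DELTA))
    # Clipped double Q-learning: take the smaller of the two critic estimates.
    q_min = min(target_q1(next_state, smoothed), target_q2(next_state, smoothed))
    return reward + gamma * (0.0 if done else 1.0) * q_min
```

Charging a fixed cost per step is one simple way to make shorter trajectories score higher; the reward function actually used in the study is not given in the abstract.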

Cite This Study

Zhang et al. (2025). Time-optimal online trajectory planning for excavator robotics based on reinforcement learning.

synapsesocial.com/papers/68c1dda254b1d3bfb60fc4e2 https://doi.org/10.1177/16878132251370408
