Deep reinforcement learning (DRL) for autonomous mobile robot navigation faces several inherent limitations. The stochastic nature of actions generated by DRL policies can undermine performance consistency, while inefficient exploration frequently delays learning or prevents the discovery of optimal solutions. This research aims to enhance the robustness of path planning by addressing these challenges. To this end, we propose a hybrid approach that integrates the flexible decision-making of deep reinforcement learning with the stability of traditional path planning. The proposed model adopts the Twin Delayed Deep Deterministic Policy Gradient (TD3) network as its base. Notably, we pre-process LiDAR point cloud data to extract only the essential features for the state representation, thereby preventing performance degradation from high-dimensional inputs and improving computational efficiency. Our model optimizes the learning process through two core strategies. First, it prioritizes experience data generated during training based on negative rewards, guiding the model to learn more frequently from critical failures than from redundant successes. Second, at each step it compares the action proposed by the TD3 network with a goal-oriented action from a classical path-planning algorithm in real time. By selecting the action with the higher estimated value, the model guides the policy toward a stable and effective trajectory from the earliest stages of training. To validate the efficacy of our approach, we conducted simulation-based experiments comparing the performance of the proposed model with existing reinforcement learning networks. To mitigate the impact of random initialization and make the comparison more reliable, all reported results are averaged over 10 independent runs with different random seeds. The results quantitatively demonstrate that our model achieves significantly higher and more stable reward values, confirming a robust improvement in the path-planning process.
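The two strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`select_action`, `sampling_weights`, `sample_batch`), the `q_value` callable standing in for the TD3 critic, and the linear weighting formula for negative rewards are all assumptions made for illustration, since the text does not specify them.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Assumed layout: an action is (linear velocity, angular velocity).
Action = Tuple[float, float]
# Assumed layout: (state, action, reward, next_state, done).
Transition = Tuple[Sequence[float], Action, float, Sequence[float], bool]


def select_action(
    state: Sequence[float],
    policy_action: Action,
    planner_action: Action,
    q_value: Callable[[Sequence[float], Action], float],
) -> Action:
    """Return whichever candidate action the critic scores higher.

    `q_value` is a hypothetical stand-in for the critic estimate
    (for TD3 this could be the minimum of the twin critics).
    Ties go to the learned policy's action.
    """
    q_policy = q_value(state, policy_action)
    q_planner = q_value(state, planner_action)
    return policy_action if q_policy >= q_planner else planner_action


def sampling_weights(
    rewards: Sequence[float], base: float = 1.0, penalty_scale: float = 1.0
) -> List[float]:
    """Give every transition a base weight, plus extra weight that grows
    with the magnitude of a negative reward (assumed linear form)."""
    return [base + penalty_scale * max(0.0, -r) for r in rewards]


def sample_batch(
    buffer: List[Transition], batch_size: int, rng: random.Random
) -> List[Transition]:
    """Sample with replacement, favouring negative-reward transitions."""
    weights = sampling_weights([t[2] for t in buffer])
    return rng.choices(buffer, weights=weights, k=batch_size)
```

In this sketch a transition with reward -2 is three times as likely to be replayed as one with a non-negative reward, and the planner's action is executed only when the critic rates it strictly higher than the policy's own proposal.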
강대열 et al. (Fri,) studied this question.