What question did this study set out to answer?

The study aims to improve path-planning methods for mobile robots navigating complex unknown environments using deep reinforcement learning techniques.

March 27, 2026 | Open Access

Path-planning method of mobile robot in complex unknown environment based on deep reinforcement learning

Key Points

Applied Deep Q-Network (DQN) reinforcement learning theory for path planning.
Used 2.5D maps to represent the state space.
Developed a heuristic reward function to optimize path characteristics.
Conducted comparative analysis with 2D and 3D map models.
Achieved a 20.3% reduction in path length compared to the 3D map model.
Reduced energy consumption by 23.67% and increased safety by 30.13%, also relative to the 3D map model.
Improved the mission success rate by 9.2% when the integrated heuristic reward was included.
With the heuristic reward, path length also fell by 38.57% and energy consumption by 51.68%, while safety rose by 51.67%.
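
For readers unfamiliar with DQN, the following is a minimal sketch of the two generic ingredients of the technique: the temporal-difference target `r + gamma * max Q(s', a')` and epsilon-greedy action selection. It is not the authors' implementation. The function names, the discount factor, and the batch layout are illustrative assumptions, and the Q-network itself is omitted.

```python
import numpy as np

def dqn_targets(rewards, next_q, dones, gamma=0.99):
    """Standard DQN targets: y = r + gamma * max_a' Q_target(s', a').

    next_q has shape (batch, n_actions) and would come from a target network.
    Terminal transitions (done=True) do not bootstrap from the next state.
    gamma=0.99 is an assumed value, not one taken from the paper.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    return rewards + gamma * (1.0 - dones) * np.max(next_q, axis=1)

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Usage with a hypothetical batch of two transitions and two actions.
targets = dqn_targets([1.0, -1.0],
                      np.array([[0.0, 2.0], [5.0, 1.0]]),
                      [False, True],
                      gamma=0.5)
action = epsilon_greedy([0.1, 0.9, 0.3], 0.0, np.random.default_rng(0))
```

In a path-planning setting, each action would correspond to a move on the map, and the targets would be regressed against the network's prediction for the action actually taken.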

Abstract

In practical applications, mobile robots are frequently required to operate in challenging environments, including post-disaster scenarios such as earthquakes and floods, as well as complex terrains like polar regions, deserts, and construction sites, where obstacles are often deformable. The combination of uneven terrain and unpredictable obstacles poses significant challenges to the robot’s energy efficiency, safety, and operational effectiveness. This paper addresses these challenges by applying Deep Q-Network (DQN) reinforcement learning theory to the path planning of mobile robots in unknown environments, proposing the “Deep Q-Network Unknown Complex Environment Path Planning” (DUCP) method. The DUCP method uses 2.5D maps for the model’s state space and incorporates a comprehensive heuristic reward function to guide the learning process, optimizing path length, energy consumption, and safety. Experimental results demonstrate that the proposed method enables robots to identify shorter, more energy-efficient, and safer paths in unknown environments. A comparative analysis reveals that the 2.5D map model outperforms both 2D and 3D map models, achieving a higher task success rate while reducing path length by 20.3% and energy consumption by 23.67% and increasing safety by 30.13% compared to the 3D map model. Additionally, the study examines the impact of the integrated heuristic reward on model performance, showing that incorporating it improves the mission success rate by 9.2%, reduces path length by 38.57%, decreases energy consumption by 51.68%, and enhances safety by 51.67%.
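
The abstract does not give the form of the heuristic reward, so the following is only a hypothetical sketch of how a reward on a 2.5D map (a grid storing one height per cell) might combine the three stated objectives: path length, energy consumption, and safety. The function name, the weights, the `max_step` threshold, and each individual term are assumptions made for illustration, not the DUCP reward itself.

```python
import numpy as np

def heuristic_reward(height_map, pos, nxt, goal,
                     w_len=1.0, w_energy=0.5, w_safe=0.5, max_step=0.3):
    """Hypothetical shaped reward for one move on a 2.5D elevation grid.

    All weights and the max_step threshold are illustrative assumptions.
    """
    pos, nxt, goal = (np.asarray(p) for p in (pos, nxt, goal))
    # Path-length term: positive when the move reduces the distance to the goal.
    progress = np.linalg.norm(goal - pos) - np.linalg.norm(goal - nxt)
    # Energy term: horizontal travel plus extra cost for climbing.
    dz = height_map[tuple(nxt)] - height_map[tuple(pos)]
    energy = np.linalg.norm(nxt - pos) + max(dz, 0.0)
    # Safety term: penalise height changes larger than the assumed safe step.
    unsafe = max(abs(dz) - max_step, 0.0)
    return w_len * progress - w_energy * energy - w_safe * unsafe

# Usage: the same move toward the goal, over flat ground and over a 1.0-high step.
flat = np.zeros((5, 5))
bumpy = flat.copy()
bumpy[0, 1] = 1.0
r_flat = heuristic_reward(flat, (0, 0), (0, 1), (0, 4))
r_climb = heuristic_reward(bumpy, (0, 0), (0, 1), (0, 4))
```

Under these assumed weights, the flat move scores higher than the climbing move, which is the kind of preference a combined length, energy, and safety reward is meant to encode.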
