Deep reinforcement learning (DRL) has shown great potential in path planning tasks. However, in sparse reward environments, DRL still faces significant challenges such as low training efficiency and a tendency to converge to suboptimal policies. Traditional reward shaping methods can partially alleviate these issues, but they typically rely on hand-crafted designs, which often introduce complex reward coupling, make hyperparameter tuning difficult, and limit generalization capability. To address these challenges, this paper proposes Curriculum-guided Learning with Adaptive Reward Shaping for Deep Q-Network (CLARS-DQN), a path planning algorithm that integrates Adaptive Reward Shaping (ARS) and Curriculum Learning (CL). The algorithm consists of two key components: (1) ARS-DQN, which augments the DQN framework with a learnable intrinsic reward function to reduce reward sparsity and dependence on expert knowledge; and (2) a curriculum strategy that guides policy optimization through a staged training process, progressing from simple to complex tasks to enhance generalization. Training also incorporates Prioritized Experience Replay (PER) to improve sample efficiency and training stability. CLARS-DQN outperforms baseline methods in task success rate, path quality, training efficiency, and hyperparameter robustness. In unseen environments, the method improves task success rate and average path length by 12% and 26%, respectively, demonstrating strong generalization. Ablation studies confirm the critical contribution of each module.
Zhang et al. (Wed,) studied this question.