ABSTRACT Task scheduling in cloud manufacturing (CMfg) systems is challenging because distributed, heterogeneous resources must be coordinated. Although CMfg enables virtualization and service‐oriented collaboration, competition and dependencies among tasks complicate efficient, real‐time resource allocation. Deep reinforcement learning (DRL) has emerged as a promising approach to CMfg task scheduling; however, existing DRL methods suffer from sparse rewards and inefficient exploration in high‐dimensional action spaces. To address these challenges, this paper proposes SPIRIT‐PD3QN, a novel DRL approach for hybrid task scheduling. We first construct a multi‐task scheduling model of a CMfg environment and formulate it as a Markov decision process (MDP). Building on this formulation, we design a DRL scheduling strategy that combines a double‐delayed deep Q‐network architecture with prioritized experience replay and a two‐dimensional action space to improve the stability and generality of scheduling decisions. Potential‐based reward shaping and curiosity‐driven exploration are further integrated to mitigate the sparse‐reward problem and improve learning efficiency. Numerical experiments show that the proposed method outperforms mainstream scheduling algorithms in overall task scheduling performance and achieves competitive results against mainstream DRL scheduling approaches across multiple evaluation metrics.
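As an illustration of the potential‐based reward shaping mentioned above, the following minimal sketch adds the standard shaping term F(s, s') = γΦ(s') − Φ(s) to an environment reward. The state encoding (remaining processing time per task), the potential function `remaining_work_potential`, and the discount value are assumptions made for this example only; the abstract does not specify the potential used in SPIRIT‐PD3QN.

```python
from typing import Callable, Sequence

# Hypothetical state encoding: remaining processing time of each pending task.
State = Sequence[float]


def remaining_work_potential(state: State) -> float:
    """Assumed potential: less remaining work gives a higher potential."""
    return -float(sum(state))


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float] = remaining_work_potential,
    gamma: float = 0.99,
) -> float:
    """Return r + gamma * Phi(s') - Phi(s), the potential-based shaped reward."""
    return reward + gamma * potential(next_state) - potential(state)


def shaped_rewards(
    rewards: Sequence[float],
    states: Sequence[State],
    gamma: float = 0.99,
) -> list:
    """Shape a whole trajectory; `states` has one more entry than `rewards`."""
    return [
        shaped_reward(r, s, s_next, gamma=gamma)
        for r, s, s_next in zip(rewards, states, states[1:])
    ]


if __name__ == "__main__":
    # Sparse reward: only the final transition is rewarded by the environment.
    trajectory = [[4.0, 3.0], [2.0, 3.0], [2.0, 0.0], [0.0, 0.0]]
    print(shaped_rewards([0.0, 0.0, 1.0], trajectory, gamma=0.9))
```

Because the shaping terms telescope along a trajectory, their discounted sum depends only on the first and last states, which is why shaping of this form provides denser feedback without changing which policy is optimal.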
Pang et al. (Mon,) studied this question.