Reinforcement learning has important applications in smart manufacturing, but coordinating multiple agents under limited communication remains a major challenge. In this paper, we propose a localized multi-agent reinforcement learning approach designed specifically for serial supply chains. We formulate the supply chain management problem as a Markov decision process and uncover the exponential decay property of the Q-functions, which allows each agent to approximate the Q-functions using only local observations and communications. Building on this property, we propose the scalable natural actor–critic (SNAC) algorithm to solve the problem. SNAC leverages localized coordination and reduces reliance on global information, thus addressing the challenges of large-scale and dynamic supply chain environments. Finally, we conduct numerical experiments to demonstrate the effectiveness of SNAC in managing serial supply chains.
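To make the idea of approximating Q-functions from local information concrete, the following is a minimal sketch, not the authors' implementation. It assumes the serial supply chain can be modeled as a line of stages indexed 0 to n-1, and that each agent only sees the states and actions of stages within a fixed number of hops. The names `k_hop_neighborhood`, `local_view`, and `kappa`, as well as the inventory and order values, are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def k_hop_neighborhood(i: int, n_stages: int, kappa: int) -> List[int]:
    """Return the stages within kappa hops of stage i on a line of n_stages stages.

    Assumption: the serial chain is a line graph, so stage j neighbors j-1 and j+1.
    """
    lo = max(0, i - kappa)
    hi = min(n_stages - 1, i + kappa)
    return list(range(lo, hi + 1))


def local_view(
    global_state: Sequence[int],
    global_action: Sequence[int],
    i: int,
    kappa: int,
) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Restrict the global state and action to the kappa-hop neighborhood of stage i.

    A truncated Q-function for agent i would take only this pair as input,
    instead of the full global state and action.
    """
    nbrs = k_hop_neighborhood(i, len(global_state), kappa)
    local_state = tuple(global_state[j] for j in nbrs)
    local_action = tuple(global_action[j] for j in nbrs)
    return local_state, local_action


if __name__ == "__main__":
    # Hypothetical inventory levels and order quantities for a 6-stage chain.
    inventory = [5, 3, 8, 2, 7, 4]
    orders = [1, 0, 2, 1, 0, 3]
    print(k_hop_neighborhood(2, 6, 1))
    print(local_view(inventory, orders, 2, 1))
```

In this sketch, the input size of each agent's truncated Q-function depends on `kappa` rather than on the length of the chain. An exponential decay property is what would justify ignoring the stages outside the neighborhood, since their influence on the agent's Q-value shrinks as the hop distance grows.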
Rongjinzi Wang
Ruiyang Jin
Jie Song
Asia Pacific Journal of Operational Research
Peking University
City University of Hong Kong
Wang et al. studied this question.
www.synapsesocial.com/papers/68c1c62f54b1d3bfb60f1c8c — DOI: https://doi.org/10.1142/s021759592540010x