This article presents a safe reinforcement learning (RL) framework for nonlinear multiagent systems (MASs) based on min-max distributed model predictive control (DMPC). The proposed method employs min-max DMPC as a robust baseline to generate control strategies, optimizing closed-loop performance in the presence of disturbances while ensuring interpretability and safety. To mitigate the conservatism inherent in traditional robust DMPC due to its reliance on precise models and fixed disturbance bounds, safe RL is introduced to adaptively update the controller parameters and disturbance sets online. The proposed parameter update mechanism of safe RL formally guarantees the recursive feasibility of the DMPC algorithm during the learning process. Furthermore, theoretical analyses of closed-loop stability are provided. The effectiveness and scalability of the proposed method are validated through two simulation examples.
Peng et al. studied this question.
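To make the min-max idea in the abstract concrete, the following is a minimal single-agent sketch, not the authors' algorithm. It assumes a scalar linear system x⁺ = a·x + b·u + w with a box-bounded disturbance |w| ≤ w_max, a quadratic cost, and a coarse input grid; the function names, the parameters (`a`, `b`, `q`, `r`, `w_max`, `horizon`) and the grid search are all illustrative assumptions. The controller picks the open-loop input sequence that minimizes the worst-case cost over the disturbance set.

```python
import itertools


def rollout_cost(x0, u_seq, w_seq, a=1.0, b=1.0, q=1.0, r=0.1):
    """Quadratic cost of one rollout of x+ = a*x + b*u + w (assumed toy model)."""
    x, cost = x0, 0.0
    for u, w in zip(u_seq, w_seq):
        cost += q * x * x + r * u * u  # stage cost
        x = a * x + b * u + w          # disturbed dynamics
    return cost + q * x * x            # terminal cost


def worst_case_cost(x0, u_seq, w_max):
    """Inner maximization over the disturbance box.

    The cost is convex in the disturbance sequence, so its maximum over
    the box is attained at a vertex; enumerating vertices is sufficient.
    """
    vertices = itertools.product((-w_max, w_max), repeat=len(u_seq))
    return max(rollout_cost(x0, u_seq, w_seq) for w_seq in vertices)


def min_max_control(x0, w_max, horizon=2, u_grid=None):
    """Outer minimization by brute-force search over a coarse input grid."""
    if u_grid is None:
        u_grid = [i / 10.0 for i in range(-20, 21)]  # inputs in [-2, 2]
    best = min(
        itertools.product(u_grid, repeat=horizon),
        key=lambda u_seq: worst_case_cost(x0, u_seq, w_max),
    )
    # Receding horizon: only the first input would be applied.
    return best[0], worst_case_cost(x0, best, w_max)


if __name__ == "__main__":
    for w_max in (0.5, 0.1):
        u0, cost = min_max_control(x0=1.0, w_max=w_max)
        print(f"w_max={w_max}: first input {u0:+.1f}, worst-case cost {cost:.3f}")
```

In this toy setting, a smaller assumed disturbance bound never increases the optimal worst-case cost, which illustrates why adapting the disturbance set online, as the abstract describes, can reduce conservatism. The sketch omits the distributed coupling between agents, the constraints, and the safe RL parameter update.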