This paper presents a novel cooperative value iteration (VI)-based adaptive dynamic programming method for multi-player differential game models, together with a convergence proof. During learning, the players are divided into two groups and adapt their policies sequentially. The method removes the dependence on admissible initial policies, which is one of the main drawbacks of policy iteration (PI)-based frameworks. Furthermore, the algorithm enables the players to adapt their control policies without full knowledge of the other players' system parameters or control laws. The efficacy of the method is illustrated by three examples.
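To illustrate why value iteration does not need an admissible initial policy, the following is a minimal sketch on a much simpler problem than the one treated in the paper: a single-player, scalar, discrete-time linear-quadratic regulator. It is not the authors' algorithm (which addresses continuous-time multi-player games with incomplete information); the system parameters `a`, `b`, `q`, `r` and all function names are assumptions chosen for illustration. The iteration starts from the zero value function even though the open-loop system is unstable, and still arrives at a stabilizing gain.

```python
def riccati_step(p, a, b, q, r):
    """One value-iteration update of p, where the value function is V(x) = p * x**2.

    Assumed model (illustrative only): x_next = a*x + b*u, stage cost q*x**2 + r*u**2.
    """
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)


def value_iteration(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the Riccati update from V_0 = 0 until successive values agree within tol."""
    p = 0.0  # zero initial value: no stabilizing (admissible) initial policy is supplied
    for _ in range(max_iter):
        p_next = riccati_step(p, a, b, q, r)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("value iteration did not converge")


def feedback_gain(p, a, b, r):
    """Greedy state-feedback gain k (control u = -k*x) for the value coefficient p."""
    return a * b * p / (r + b * b * p)


# Hypothetical example: a = 1.2 makes the uncontrolled system unstable.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
p_star = value_iteration(a, b, q, r)
k_star = feedback_gain(p_star, a, b, r)
closed_loop_pole = a - b * k_star
```

A policy-iteration scheme for the same problem would instead have to be initialized with a gain that already stabilizes the system, which is the restriction the abstract refers to.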
Yun Zhang
Shanghai Jiao Tong University
Lulu Zhang
Shanghai Jiao Tong University
Yunze Cai
Ministry of Education of the People's Republic of China
IEEE/CAA Journal of Automatica Sinica
Zhang et al. studied this question.
synapsesocial.com/papers/68e79836b6db64358770846f — DOI: https://doi.org/10.1109/jas.2023.124125