Abstract This article investigates the policy iteration (PI) method for the discounted optimal control (DOC) problem of continuous-time linear systems. We establish the properties and convergence of the PI method. The theoretical analysis shows that convergence of PI can be ensured without requiring an initial admissible control gain. The convergence rate of the PI method is also provided. An iteration-termination criterion is established for detecting the stability of the closed-loop system under the control gain obtained by executing PI. Two kinds of data-driven implementations are constructed without using prior information about the system dynamics. A simulation example is presented to validate the properties of the PI method.
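To make the setting concrete, the following is a minimal sketch of a classical Kleinman-style policy iteration applied to a scalar discounted problem. It is not the article's algorithm: it is model-based, it assumes a starting gain with finite discounted cost (the article's result removes that requirement), and it does not implement the article's termination criterion. The cost is assumed to be the integral of exp(-gamma*t) * (q*x^2 + r*u^2) for dx/dt = a*x + b*u with u = -k*x, and all function and parameter names are hypothetical.

```python
import math


def discounted_pi_scalar(a, b, q, r, gamma, k0=0.0, tol=1e-12, max_iter=100):
    """Kleinman-style policy iteration for a scalar discounted LQR problem.

    Assumption (not from the article): the discount is absorbed by shifting
    the drift to a - gamma/2, and k0 makes a - gamma/2 - b*k0 negative.
    """
    a_shift = a - gamma / 2.0
    k = k0
    history = []
    for _ in range(max_iter):
        closed = a_shift - b * k
        if closed >= 0:
            raise ValueError("gain does not yield a finite discounted cost")
        # Policy evaluation: solve 2*closed*p + q + r*k**2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * closed)
        history.append(p)
        # Policy improvement: k_next = b*p/r.
        k_next = b * p / r
        converged = abs(k_next - k) < tol
        k = k_next
        if converged:
            break
    return p, k, history


def discounted_are_scalar(a, b, q, r, gamma):
    """Closed-form positive root of 2*(a - gamma/2)*p - (b*p)**2/r + q = 0."""
    a_shift = a - gamma / 2.0
    return r * (a_shift + math.sqrt(a_shift ** 2 + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k, history = discounted_pi_scalar(a=1.0, b=1.0, q=1.0, r=1.0, gamma=3.0)
    p_star = discounted_are_scalar(a=1.0, b=1.0, q=1.0, r=1.0, gamma=3.0)
    print("PI value:", p, "closed-form value:", p_star)
    # Naive check of the undiscounted closed loop a - b*k (not the article's criterion).
    print("undiscounted closed loop stable:", 1.0 - 1.0 * k < 0)
```

With these illustrative parameters the iterates decrease toward the closed-form value, yet the resulting gain leaves the undiscounted closed loop `a - b*k` positive. This shows why a stability check on the gain returned by PI is a meaningful addition in the discounted setting.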
Dong et al. (Thu,) studied this question.