In this paper we show that, for a finite Markov decision process, an average optimal policy can be found by solving a single linear programming problem. We also analyze the relation between the set of feasible solutions of the linear program and the set of stationary policies.
Hordijk et al. (Sun,) studied this question.
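The relation between feasible solutions and stationary policies mentioned in the abstract can be illustrated on a toy example. The sketch below is not the paper's construction. It assumes the simpler ergodic case, in which a stationary policy induces state-action frequencies x_i(a) that satisfy the balance constraint sum_{i,a} (delta_ij - p_ij(a)) x_i(a) = 0 found in standard average-reward linear programs, and in which the policy can be recovered by normalizing x within each state. The general (multichain) case needs more than this normalization. The transition probabilities, rewards, and all function names are hypothetical and chosen only for illustration.

```python
# Toy 2-state, 2-action MDP. All numbers are made up for illustration.
# P[i][a][j] = probability of moving from state i to state j under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.3, 0.7]],
]
# R[i][a] = one-step reward for taking action a in state i.
R = [[1.0, 0.0], [2.0, 3.0]]
N_STATES, N_ACTIONS = 2, 2


def stationary_distribution(policy, iters=2000):
    """Power iteration for the chain induced by a randomized stationary policy.

    policy[i][a] is the probability of choosing action a in state i.
    Assumption: the induced chain is ergodic, which holds here because
    every transition probability is positive.
    """
    mu = [1.0 / N_STATES] * N_STATES
    for _ in range(iters):
        mu = [
            sum(mu[i] * policy[i][a] * P[i][a][j]
                for i in range(N_STATES) for a in range(N_ACTIONS))
            for j in range(N_STATES)
        ]
    return mu


def policy_to_x(policy):
    """State-action frequencies x[i][a] = mu[i] * policy[i][a]."""
    mu = stationary_distribution(policy)
    return [[mu[i] * policy[i][a] for a in range(N_ACTIONS)]
            for i in range(N_STATES)]


def balance_residuals(x):
    """Left-hand side of sum_{i,a} (delta_ij - p_ij(a)) x_i(a) for each j."""
    return [
        sum(((1.0 if i == j else 0.0) - P[i][a][j]) * x[i][a]
            for i in range(N_STATES) for a in range(N_ACTIONS))
        for j in range(N_STATES)
    ]


def x_to_policy(x):
    """Recover the policy by normalizing x within each state.

    Requires sum_a x[i][a] > 0 for every state i (true in the ergodic case).
    """
    return [[x[i][a] / sum(x[i]) for a in range(N_ACTIONS)]
            for i in range(N_STATES)]


def average_reward(x):
    """Linear objective sum_{i,a} r_i(a) x_i(a): the long-run average reward."""
    return sum(R[i][a] * x[i][a]
               for i in range(N_STATES) for a in range(N_ACTIONS))


policy = [[0.25, 0.75], [0.6, 0.4]]   # a randomized stationary policy
x = policy_to_x(policy)               # its state-action frequencies
recovered = x_to_policy(x)            # maps back to the same policy
```

In this setting the average reward is linear in x, which is what makes a linear programming formulation possible: maximizing `average_reward(x)` over all nonnegative x that satisfy the balance and normalization constraints amounts to a search over stationary policies.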