Key points are not available for this paper at this time.
We present a new complexity result on solving the Markov decision problem (MDP) with n states and a number of actions for each state, a special class of real-number linear programs with the Leontief matrix structure. We prove that when the discount factor θ is strictly less than 1, the problem can be solved in at most O(n 1.5 (log 1/(1 − θ) + log n)) classical interior-point method iterations and O(n 4 (log 1/(1 − θ) + log n)) arithmetic operations. Our method is a combinatorial interior-point method related to the work of Ye (1990. A “build-down” scheme for linear programming. Math. Programming 46 61–72) and Vavasis and Ye (1996. A primal-dual interior-point method whose running time depends only on the constraint matrix. Math. Programming 74 79–120). To our knowledge, this is the first strongly polynomial-time algorithm for solving the MDP when the discount factor is a constant less than 1.
Yinyu Ye (Mon,) studied this question.