January 1, 2003

On Actor-Critic Algorithms

Abstract

In this article, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference learning with a linearly parameterized approximation architecture, and the actor is updated in an approximate gradient direction, based on information provided by the critic. We show that the features for the critic should ideally span a subspace prescribed by the choice of parameterization of the actor. We study actor-critic algorithms for Markov decision processes with Polish state and action spaces. We state and prove two results regarding their convergence.
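The structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a finite two-state, two-action toy problem (the `step` dynamics are hypothetical), an average-reward TD(0) critic as one simple form of temporal difference learning, a softmax actor, and illustrative step-size exponents. The critic is linear in the features psi(s, a) = grad_theta log pi(a | s), which is one way to make the critic's features span the subspace determined by the actor's parameterization.

```python
import math
import random

N_STATES, N_ACTIONS = 2, 2
DIM = N_STATES * N_ACTIONS  # one actor parameter per (state, action) pair


def policy(theta, s):
    """Softmax policy pi(a | s) over the preferences stored in theta."""
    prefs = [theta[s * N_ACTIONS + a] for a in range(N_ACTIONS)]
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def compatible_features(theta, s, a):
    """psi(s, a) = grad_theta log pi(a | s); used as the critic's features."""
    probs = policy(theta, s)
    psi = [0.0] * DIM
    for b in range(N_ACTIONS):
        psi[s * N_ACTIONS + b] = (1.0 if b == a else 0.0) - probs[b]
    return psi


def step(s, a):
    """Hypothetical toy dynamics: action 1 switches state; landing in state 1 pays 1."""
    s_next = 1 - s if a == 1 else s
    return s_next, float(s_next == 1)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def actor_critic(n_steps=20000, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * DIM  # actor parameters (slow time scale)
    w = [0.0] * DIM      # critic parameters (fast time scale)
    avg_reward = 0.0     # running estimate of the average reward
    s = 0
    a = rng.choices(range(N_ACTIONS), weights=policy(theta, s))[0]
    for k in range(1, n_steps + 1):
        # Assumed step sizes: the ratio actor_lr / critic_lr tends to 0.
        critic_lr = 0.5 / k ** 0.6
        actor_lr = 0.1 / k ** 0.9
        s_next, r = step(s, a)
        a_next = rng.choices(range(N_ACTIONS), weights=policy(theta, s_next))[0]
        psi = compatible_features(theta, s, a)
        psi_next = compatible_features(theta, s_next, a_next)
        q = dot(w, psi)  # critic's linear estimate for the current pair
        td_error = r - avg_reward + dot(w, psi_next) - q
        # Critic: temporal difference update on the fast time scale.
        avg_reward += critic_lr * (r - avg_reward)
        w = [wi + critic_lr * td_error * x for wi, x in zip(w, psi)]
        # Actor: approximate gradient step using the critic's estimate.
        theta = [ti + actor_lr * q * x for ti, x in zip(theta, psi)]
        s, a = s_next, a_next
    return theta, w, avg_reward
```

Calling `actor_critic()` returns the final actor parameters, critic weights, and average-reward estimate. Inspecting `policy(theta, s)` for each state shows what the actor has learned. The step-size schedule and the toy dynamics are placeholders chosen for readability, so the sketch carries none of the convergence guarantees stated in the paper.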

Cite This Study

Konda et al. (2003) studied this question.

synapsesocial.com/papers/6a0db5109a2918c675a4f99d https://doi.org/10.1137/s0363012901385691
