May 10, 2010

Optimal policy switching algorithms for reinforcement learning

Abstract

We address the problem of single-agent, autonomous sequential decision making. We assume that some controllers or behavior policies are given as prior knowledge, and the task of the agent is to learn how to switch between these policies. We formulate the problem using the framework of reinforcement learning and options (Sutton, Precup & Singh, 1999; Precup, 2000). We derive gradient-based algorithms for learning the termination conditions of options, with the goal of optimizing the expected long-term return. We incorporate the proposed approach into policy-gradient methods with linear function approximation.
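The idea of learning termination conditions by gradient ascent can be illustrated with a small sketch. This is not the algorithm derived in the paper; it is a generic likelihood-ratio (REINFORCE-style) update under assumed choices: a sigmoid termination probability over linear state features, and hypothetical names (`termination_prob`, `grad_log_termination`, `reinforce_update`) that do not come from the source.

```python
import numpy as np

def termination_prob(theta, phi):
    """Assumed parameterization: beta(s) = sigmoid(theta . phi(s))."""
    return 1.0 / (1.0 + np.exp(-np.dot(theta, phi)))

def grad_log_termination(theta, phi, terminated):
    """Gradient w.r.t. theta of the log-probability of the observed
    terminate/continue decision under the sigmoid parameterization."""
    beta = termination_prob(theta, phi)
    # d/dtheta log(beta)     = (1 - beta) * phi   (option terminated)
    # d/dtheta log(1 - beta) = -beta * phi        (option continued)
    return (float(terminated) - beta) * phi

def reinforce_update(theta, episode, step_size):
    """One gradient-ascent step on the expected return.

    episode: list of (phi, terminated, return_from_step) tuples, where
    return_from_step is the sampled return following that decision.
    """
    grad = np.zeros_like(theta)
    for phi, terminated, ret in episode:
        grad += ret * grad_log_termination(theta, phi, terminated)
    return theta + step_size * grad

# Usage: terminating in this state was followed by a positive return,
# so the update raises the termination probability there.
theta = np.zeros(2)
phi = np.array([1.0, 0.0])
new_theta = reinforce_update(theta, [(phi, True, 2.0)], step_size=0.1)
```

In this sketch the features `phi` play the role of the linear function approximation mentioned in the abstract; the given behavior policies themselves stay fixed and only the switching (termination) parameters are adjusted.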

Cite This Study

Comanici et al. (2010) studied this question.

synapsesocial.com/papers/6a206635a6bc523684956cd2 https://doi.org/10.5555/1838206.1838300
