May 10, 2010

Optimal policy switching algorithms for reinforcement learning

Abstract

We address the problem of single-agent, autonomous sequential decision making. We assume that some controllers or behavior policies are given as prior knowledge, and the task of the agent is to learn how to switch between these policies. We formulate the problem using the framework of reinforcement learning and options (Sutton, Precup & Singh, 1999; Precup, 2000). We derive gradient-based algorithms for learning the termination conditions of options, with the goal of optimizing the expected long-term return. We incorporate the proposed approach into policy-gradient methods with linear function approximation.
