May 10, 2019 | Open Access

Memory Bounded Open-Loop Planning in Large POMDPs using Thompson Sampling

Abstract

State-of-the-art approaches to partially observable planning, such as POMCP, are based on stochastic tree search. While these approaches are computationally efficient, they may still construct search trees of considerable size, which can limit performance when memory resources are restricted. In this paper, we propose Partially Observable Stacked Thompson Sampling (POSTS), a memory bounded approach to open-loop planning in large POMDPs, which optimizes a fixed-size stack of Thompson Sampling bandits. We empirically evaluate POSTS in four benchmark problems and compare its performance with different tree-based approaches. We show that POSTS achieves competitive performance compared to tree-based open-loop planning and offers a performance-memory tradeoff, making it suitable for partially observable planning with highly restricted computational and memory resources.
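To make the idea of a fixed stack of Thompson Sampling bandits more concrete, the following is a minimal sketch and not the authors' implementation. It assumes one bandit per planning step, a simple Gaussian posterior over each action's mean return, and a hypothetical `simulate(actions, rng)` function standing in for a generative POMDP model (which would normally start from a state sampled from the current belief). All names, the prior width, and the toy problem are illustrative assumptions.

```python
import math
import random


class GaussianArm:
    """Running mean/variance of the returns observed for one action."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        # Welford's online update of mean and sum of squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def sample(self, rng, prior_std=10.0):
        # Assumption: a wide zero-mean prior until two returns are seen,
        # then a Gaussian around the sample mean (standard error as width).
        if self.n < 2:
            return rng.gauss(0.0, prior_std)
        var = self.m2 / (self.n - 1)
        return rng.gauss(self.mean, math.sqrt(var / self.n) + 1e-6)


def plan(simulate, n_actions, horizon, n_sims, gamma=0.95, seed=0):
    """Open-loop planning with one Thompson Sampling bandit per time step."""
    rng = random.Random(seed)
    # Memory is fixed at horizon * n_actions arms, independent of n_sims.
    stack = [[GaussianArm() for _ in range(n_actions)] for _ in range(horizon)]
    for _ in range(n_sims):
        # Each bandit picks the action with the highest posterior sample.
        actions = [
            max(range(n_actions), key=lambda a: bandit[a].sample(rng))
            for bandit in stack
        ]
        rewards = simulate(actions, rng)
        # Back up discounted returns from the last step to the first.
        ret = 0.0
        for t in reversed(range(len(rewards))):
            ret = rewards[t] + gamma * ret
            stack[t][actions[t]].update(ret)
    # Recommend the first-step action with the highest mean return.
    return max(range(n_actions), key=lambda a: stack[0][a].mean)


def toy_simulate(actions, rng):
    # Hypothetical toy model: action 1 yields reward 1, any other action 0.
    return [1.0 if a == 1 else 0.0 for a in actions]


best = plan(toy_simulate, n_actions=2, horizon=3, n_sims=200, gamma=0.5)
```

Because the plan is open-loop, the bandits condition only on the time step and not on observation histories, so the number of stored statistics stays constant no matter how many simulations are run. A tree-based planner would instead keep adding nodes as the search proceeds.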
