July 7, 2024Open Access

Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization

Key Points

Key points are not available for this paper at this time.

Abstract

Monte Carlo tree search (MCTS) has been successful in a variety of domains, but faces challenges with long-horizon exploration when compared to sampling-based motion planning algorithms like Rapidly-Exploring Random Trees. To address these limitations of MCTS, we derive a tree search algorithm based on policy optimization with state occupancy measure regularization, which we call Volume-MCTS. We show that count-based exploration and sampling-based motion planning can be derived as approximate solutions to this state occupancy measure regularized objective. We test our method on several robot navigation problems, and find that Volume-MCTS outperforms AlphaZero and displays significantly better long-horizon exploration properties.

Read Full Paperexternally

اسأل الذكاء الاصطناعي

Bookmark

View Full Paper

Cite This Study

Schramm et al. (Sun,) studied this question.

synapsesocial.com/papers/68e6128fb6db6435875a52f6 https://doi.org/https://doi.org/10.48550/arxiv.2407.05511

Also Consider

Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:

اسأل الذكاء الاصطناعي

Bookmark

View Full Paper