February 24, 2018. Open Access

Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Abstract

Reinforcement learning (RL) agents improve through trial-and-error, but when reward is sparse and the agent cannot discover successful action sequences, learning stagnates. This has been a notable problem in training deep RL agents to perform web-based tasks, such as booking flights or replying to emails, where a single mistake can ruin the entire sequence of actions. A common remedy is to "warm-start" the agent by pre-training it to mimic expert demonstrations, but this is prone to overfitting. Instead, we propose to constrain exploration using demonstrations. From each demonstration, we induce high-level "workflows" which constrain the allowable actions at each time step to be similar to those in the demonstration (e.g., "Step 1: click on a textbox; Step 2: enter some text"). Our exploration policy then learns to identify successful workflows and samples actions that satisfy these workflows. Workflows prune out bad exploration directions and accelerate the agent's ability to discover rewards. We use our approach to train a novel neural policy designed to handle the semi-structured nature of websites, and evaluate on a suite of web tasks, including the recent World of Bits benchmark. We achieve new state-of-the-art results, and show that workflow-guided exploration improves sample efficiency over behavioral cloning by more than 100x.
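The idea of a workflow constraining the allowable actions at each time step can be illustrated with a minimal sketch. This is not the paper's implementation: the `Action` type, the `click_on` and `type_into` helpers, and the uniform sampling rule are all hypothetical names and simplifications chosen for illustration. In particular, the abstract describes an exploration policy that learns which workflows succeed, whereas this sketch only samples uniformly among actions that satisfy a given step.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Action:
    """Hypothetical web action: what to do and on which kind of element."""
    kind: str       # "click" or "type"
    tag: str        # HTML tag of the target element, e.g. "input"
    text: str = ""  # text to enter (only used by "type" actions)


# A workflow step is modeled here as a predicate over candidate actions.
WorkflowStep = Callable[[Action], bool]


def click_on(tag: str) -> WorkflowStep:
    """Step constraint: click on any element with the given tag."""
    return lambda a: a.kind == "click" and a.tag == tag


def type_into(tag: str) -> WorkflowStep:
    """Step constraint: enter some text into any element with the given tag."""
    return lambda a: a.kind == "type" and a.tag == tag


def allowed_actions(step: WorkflowStep, candidates: Sequence[Action]) -> List[Action]:
    """Keep only the candidate actions that satisfy the current workflow step."""
    return [a for a in candidates if step(a)]


def sample_action(step: WorkflowStep, candidates: Sequence[Action],
                  rng: random.Random) -> Action:
    """Sample uniformly among allowed actions (an assumption, not a learned policy)."""
    allowed = allowed_actions(step, candidates)
    if not allowed:
        raise ValueError("no candidate action satisfies this workflow step")
    return rng.choice(allowed)


# "Step 1: click on a textbox; Step 2: enter some text"
workflow = [click_on("input"), type_into("input")]

candidates = [
    Action("click", "input"),
    Action("click", "button"),
    Action("type", "input", "hello"),
    Action("type", "input", "world"),
]

rng = random.Random(0)
episode = [sample_action(step, candidates, rng) for step in workflow]
```

Here the first step leaves a single allowed action (clicking the textbox) and the second leaves two (typing either string), so exploration is restricted to actions that resemble the demonstration rather than ranging over every possible click or keystroke.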
