January 1, 2006

Autonomous shaping

Abstract

We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks that are related but distinct. Such agents can be trained on a sequence of relatively easy tasks in order to develop a more informative measure of reward that can be transferred to improve performance on more difficult tasks without requiring a hand-coded shaping function. We use a rod positioning task to show that this significantly improves performance even after a very brief training period.
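
The abstract does not spell out how the learned predictor enters the reward signal. As a minimal sketch only, the Python below assumes one common construction, potential-based shaping, in which a predictor fitted on earlier tasks supplies a potential value for each state and the shaped reward adds the discounted change in that potential. The names `fit_linear_predictor`, `predict`, and `shaped_reward`, the linear model, and the feature vectors are all hypothetical illustrations, not the authors' implementation.

```python
from typing import List, Sequence


def fit_linear_predictor(
    features: Sequence[Sequence[float]],
    targets: Sequence[float],
    lr: float = 0.1,
    epochs: int = 500,
) -> List[float]:
    """Fit weights w so that dot(w, x) approximates each target.

    Assumption: features are task-independent ("portable") observations
    gathered on earlier tasks, and targets are the returns seen there.
    """
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        for x, y in zip(features, targets):
            # Stochastic gradient step on the squared prediction error.
            err = y - sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Potential estimate for a state described by feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def shaped_reward(
    reward: float, phi_s: float, phi_next: float, gamma: float = 0.99
) -> float:
    """Environment reward plus the potential-based shaping term."""
    return reward + gamma * phi_next - phi_s
```

In a later task, the agent would call `predict` on the current and next state's features and pass both values to `shaped_reward` in place of the raw reward, so that transitions toward states the predictor rates highly receive intermediate credit.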

Cite This Study

Konidaris et al. (2006) studied this question.

synapsesocial.com/papers/6a1d557633e2df9c962f579c https://doi.org/10.1145/1143844.1143906
