October 1, 1965 (Open Access)

The Robbins-Isbell Two-Armed-Bandit Problem with Finite Memory

Abstract

This paper studies the sequential decision model known as the two-armed bandit with finite memory. It was introduced by Robbins [8] in 1956 and studied further by Isbell [5] in 1959. In this paper, a set of rules is defined which are uniformly better than those given in [5] and [8]. A much larger class of rules is then defined, one member of which is conjectured to be a uniformly best rule.
