We consider the problem of successively choosing one of two ways of action, each of which may lead to success or failure, in such a way as to maximize the long-run proportion of successes obtained, the choice each time being based on the results of a fixed number of the previous trials.
Herbert Robbins studied this question.
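To make the setting concrete, the following is a minimal sketch of one simple rule whose choice depends only on the single most recent trial: repeat the same action after a success, switch to the other after a failure. This "win-stay, lose-shift" rule is chosen purely for illustration and is not claimed to be the rule analysed in the paper; the function names and the success probabilities `p1` and `p2` are assumptions introduced here.

```python
import random


def wsls_long_run_rate(p1, p2):
    """Long-run proportion of successes under win-stay, lose-shift.

    The action in use forms a two-state Markov chain: action 1 is left with
    probability q1 = 1 - p1, action 2 with probability q2 = 1 - p2, so the
    stationary weights are proportional to q2 and q1 respectively.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    if q1 + q2 == 0.0:
        return 1.0  # both actions always succeed
    return (q2 * p1 + q1 * p2) / (q1 + q2)


def simulate_wsls(p1, p2, n_trials, seed=0):
    """Simulate the rule and return the observed proportion of successes."""
    rng = random.Random(seed)
    probs = (p1, p2)
    arm = 0  # start with the first action (an arbitrary choice)
    successes = 0
    for _ in range(n_trials):
        success = rng.random() < probs[arm]
        successes += success
        if not success:
            arm = 1 - arm  # lose-shift: switch actions after a failure
    return successes / n_trials


if __name__ == "__main__":
    print(wsls_long_run_rate(0.8, 0.4))
    print(simulate_wsls(0.8, 0.4, 200_000))
```

With `p1 = 0.8` and `p2 = 0.4`, the closed form gives a long-run success proportion of 0.7: better than the 0.6 obtained by choosing an action at random on each trial, but short of the 0.8 achievable if the better action were known in advance.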