Abstract

Exploring novel environments through sequential sampling is essential for efficient decision-making under uncertainty. In the laboratory, human exploration has been studied in situations where it is traded against reward maximisation. By design, these ‘explore-exploit’ dilemmas confound the behavioural characteristics of exploration with those of the trade-off itself. Here, we propose a sequential sampling task in which exploration can be compared in the presence and absence of a trade-off with exploitation. Detailed model-based analyses of choices reveal specific exploration patterns that arise when information seeking is neither traded against reward seeking nor influenced by prospective value. Human choices are directed toward the most uncertain option available, but only after an initial sampling phase consisting of repeated choices from each novel option. These findings outline two competing cognitive pressures on information seeking: the repeated sampling of the current option (local uncertainty minimisation) and the directed sampling of the most uncertain option available (global uncertainty minimisation).
Almeras et al. studied this question.