What question did this study set out to answer?

The research aims to understand how individuals adapt their strategies in competitive games based on opponent predictability and learning methods.

March 19, 2026

EXPRESS: People can adaptively exploit model-free and model-based reinforcement learning in competitive games

Key Points

Conducted two Rock, Paper, Scissors experiments in which players faced computerized opponents
Experiment 1 used model-free reinforcement learning algorithms
Experiment 2 utilized model-based reinforcement learning algorithms
Compared the strategies participants used across the two experiments
Participants significantly outperformed both model-free and model-based RL opponents
In Experiment 1, players tended to use a Win-Stay/Lose-Shift strategy
In Experiment 2, players tended to use a Win-Shift/Lose-Shift strategy
Findings contradicted predictions from traditional reinforcement learning models
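
The strategy labels in the points above (Win-Stay/Lose-Shift, Win-Shift/Lose-Shift) describe how often a player repeats or changes a move after each outcome. A minimal sketch of how such rates could be tallied from one player's round-by-round record follows. The function name, the move codes, and the outcome labels are illustrative assumptions, not the analysis used in the study.

```python
from collections import Counter


def transition_rates(moves, outcomes):
    """Proportion of 'stay' vs. 'shift' choices following each outcome.

    moves:    one player's moves in order, e.g. ["R", "P", "S", ...]
    outcomes: that player's result on each round: "win", "loss" or "draw"
    (Both encodings are assumptions made for this sketch.)
    """
    counts = Counter()
    # Pair each round's move and outcome with the move made on the next round.
    for prev_move, prev_outcome, next_move in zip(moves, outcomes, moves[1:]):
        action = "stay" if next_move == prev_move else "shift"
        counts[(prev_outcome, action)] += 1

    rates = {}
    for outcome in ("win", "loss", "draw"):
        total = counts[(outcome, "stay")] + counts[(outcome, "shift")]
        if total:  # skip outcomes that were never followed by another round
            rates[outcome] = {
                "stay": counts[(outcome, "stay")] / total,
                "shift": counts[(outcome, "shift")] / total,
            }
    return rates


# A player who always repeats after a win and always changes after a loss.
rates = transition_rates(["R", "R", "P", "P", "S"],
                         ["win", "loss", "win", "loss", "win"])
```

Under this tally, a Win-Stay/Lose-Shift player shows a high "stay" rate after wins and a high "shift" rate after losses, while a Win-Shift/Lose-Shift player shows high "shift" rates after both.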

Abstract

A key goal for organisms in competitive social interactions is learning strategies that outperform opponents. Despite the ample literature on modeling strategic behaviors in games, little research has parametrically examined the degree to which individuals' performance and strategy depend on an opponent's predictability and how these change over time. To address these questions, we conducted two experiments investigating people's behavior in the competitive game Rock, Paper, Scissors against computerized opponents programmed using reinforcement learning algorithms. In Experiment 1, the RL algorithms were model-free, where only the values of selected actions were updated following feedback. In Experiment 2, the algorithms were model-based, where the values of unselected actions were also updated. Results from both experiments showed that subjects significantly outperformed both classes of RL opponents, but they implemented different strategies to do so. Specifically, participants tended to engage in a Win-Stay/Lose-Shift strategy in Experiment 1 but a Win-Shift/Lose-Shift strategy in Experiment 2, contrary to behaviors predicted by typical RL models and learning theories. We discuss the theoretical and practical implications of shifting away from reinforced behaviors, reinforcement learning as a representative computational framework of strategic decision-making, and how future research can continue this investigation by testing additional models and competitive games.
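
The distinction the abstract draws between the two classes of opponent can be illustrated with a small sketch. It assumes a simple delta-rule value update, a payoff coding of +1 for a win, 0 for a draw and -1 for a loss, and a learning rate `alpha`. None of these details are given in the abstract, so the code shows only the general idea (update the chosen action versus update every action), not the study's actual algorithms.

```python
MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(own, other):
    """Assumed payoff coding: +1 win, 0 draw, -1 loss."""
    if own == other:
        return 0.0
    return 1.0 if BEATS[own] == other else -1.0


def update_model_free(values, chosen, opponent_move, alpha=0.5):
    """Update only the value of the action that was actually selected."""
    new_values = dict(values)
    error = payoff(chosen, opponent_move) - new_values[chosen]
    new_values[chosen] += alpha * error
    return new_values


def update_all_actions(values, opponent_move, alpha=0.5):
    """Update every action, including the unselected ones, using the payoff
    each would have earned against the opponent's move."""
    return {
        move: value + alpha * (payoff(move, opponent_move) - value)
        for move, value in values.items()
    }


start = {move: 0.0 for move in MOVES}
# The agent played rock and the opponent played scissors.
selected_only = update_model_free(start, "rock", "scissors")
every_action = update_all_actions(start, "scissors")
```

After a single round, the first update has changed only the value of rock, whereas the second has also lowered the value of paper, which would have lost to scissors.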
