What question did this study set out to answer?

This mini review asks how social learning interacts with the exploration–exploitation dilemma in decision-making.

March 5, 2026 | Open Access

Social learning and exploration–exploitation dilemma in decision-making

What does this research mean for the field?

The review presents evidence that humans exhibit a reliability-seeking bias in social learning, preferring to learn from consistent, exploitation-oriented partners rather than highly exploratory ones. This is flagged as a novel finding that challenges the consensus view.

Key Points

Analyzes the neurocomputational principles underlying social learning.
Discusses the distinct neural signals in the ventromedial and lateral prefrontal cortices that mediate learning from others.
Examines strategic behaviors shaped by social environments, such as informational free-riding and conformity.
Presents evidence on source selection, that is, how agents decide whom to observe.
Identifies a reliability-seeking bias: humans prefer to learn from consistent, exploitation-oriented partners rather than highly exploratory ones.
Highlights social cues such as competence and predictability, which current paradigms inherently confound.
Outlines the limitations of current paradigms and a computational framework for isolating the drivers of adaptive social decision-making.

Abstract

This mini review examines the neurocomputational principles of social learning through the lens of the exploration–exploitation dilemma. While the neural mechanisms of learning from others—mediated by distinct signals in the ventromedial and lateral prefrontal cortices—are well established, less is known about how these mechanisms interact with the fundamental trade-off between gathering information (“exploration”) and maximizing rewards (“exploitation”). We discuss how social environments shape this trade-off, leading to strategic behaviors such as informational free-riding or conformity. A central focus of this review is the issue of source selection: how agents decide whom to observe. We present recent evidence suggesting that, contrary to the predictions of optimal information-seeking theories, humans often exhibit a “reliability-seeking” bias, preferring to learn from consistent, exploitation-oriented partners rather than highly exploratory ones. We conclude by discussing the limitations of current paradigms, specifically the inherent confounding of social cues such as competence and predictability, and outline a computational framework for isolating the specific drivers of adaptive social decision-making.
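The contrast between exploratory and exploitation-oriented partners can be made concrete with a small sketch. This is not taken from the review. It assumes a softmax choice rule, a common modeling choice for this trade-off, in which an inverse temperature (called `beta` here) controls how strongly an agent favors the option it currently values most. The function names and the value estimates are hypothetical and purely illustrative.

```python
import math

def softmax_policy(values, beta):
    """Choice probabilities under a softmax rule with inverse temperature beta."""
    m = max(values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def choice_entropy(probs):
    """Shannon entropy (in bits) of a choice distribution; lower means more predictable."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical value estimates for a two-option task (illustrative numbers only).
values = [0.7, 0.3]

# Assumption: a low beta stands in for an exploratory partner,
# a high beta for a consistent, exploitation-oriented partner.
exploratory = softmax_policy(values, beta=1.0)
exploitative = softmax_policy(values, beta=10.0)

print("exploratory :", exploratory, choice_entropy(exploratory))
print("exploitative:", exploitative, choice_entropy(exploitative))
```

Under these assumptions the high-`beta` partner chooses the better option more often and has lower choice entropy, so its behavior is both more rewarding to copy and easier to predict. That overlap is one simple way to see why cues such as competence and predictability are hard to separate in observed choices.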
