Dopamine signaling has become closely associated with reward prediction errors (RPEs), the difference between experienced and expected value. Although not without controversy, the dopamine RPE hypothesis is one of the most influential ideas in neuroscience. This review briefly summarizes its origins, empirical foundations, and theoretical development. We begin with early psychological studies demonstrating that prediction errors, broadly defined, are central drivers of learning. These experiments inspired mathematical models that formalized associative learning rules and informed the development of reinforcement learning algorithms for artificial systems, including the influential temporal difference reinforcement learning (TDRL) framework, in which learning is guided by prediction errors in value or reward. These theoretical proposals converged with neuroscience through the landmark discovery that midbrain dopamine neurons show activity patterns strikingly similar to the RPEs proposed in TDRL. The idea that this unique neuronal population, already implicated in several behavioral processes and brain disorders, could encode a computational variable central to reinforcement learning algorithms was a major conceptual shift, and it provided a strong framework for rigorous hypothesis testing. Over the past three decades, increasingly sophisticated experiments have both replicated the core dopamine RPE finding across distinct experimental contexts and revealed important deviations from the predictions of the canonical model. These exceptions have sparked ongoing debate about whether the hypothesis should be extended, revised, or replaced. The history of the dopamine RPE hypothesis is a quintessential example of how integrating theory and experiment can drive progress in neuroscience, and it offers a template for theoretical-experimental synthesis.
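To make the notion of a prediction error in TDRL concrete, the following is a minimal sketch of tabular TD(0) learning on a short chain of states that ends in a reward. It is an illustration under stated assumptions, not the model used in the review: the function name `run_trials`, the three-step trial structure, and the values of `alpha` (learning rate) and `gamma` (discount factor) are all hypothetical choices. The error at each step is computed as the standard TD error, delta = r + gamma * V(next) - V(current).

```python
def run_trials(n_trials, n_steps=3, reward=1.0, alpha=0.2, gamma=1.0):
    """Tabular TD(0) on a fixed chain of states with a reward at the last step.

    Returns a list with one entry per trial, each holding the prediction
    error computed at every step of that trial. All parameter values are
    illustrative assumptions.
    """
    # One value per state, plus a terminal state whose value stays at 0.
    values = [0.0] * (n_steps + 1)
    history = []
    for _ in range(n_trials):
        deltas = []
        for t in range(n_steps):
            # Reward is delivered only on the final step of the trial.
            r = reward if t == n_steps - 1 else 0.0
            # TD error: experienced outcome minus the current prediction.
            delta = r + gamma * values[t + 1] - values[t]
            # Move the prediction a fraction alpha toward the outcome.
            values[t] += alpha * delta
            deltas.append(delta)
        history.append(deltas)
    return history


if __name__ == "__main__":
    history = run_trials(200)
    print("first trial:", history[0])
    print("last trial: ", history[-1])
```

In this sketch the error at the time of reward is large on the first trial, when the reward is unexpected, and shrinks toward zero as the value estimates come to predict it. The sketch does not model the response to an unpredicted cue, which would require an additional state whose onset cannot be anticipated.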
Dudhabhate et al. studied this question.