
Title: The ventral striatum dissociates information expectation, reward anticipation, and reward receipt

Do dopaminergic reward structures represent the expected utility of information similarly to a reward? Optimal experimental design models from Bayesian decision theory and statistics have proposed a theoretical framework for quantifying the expected value of information that might result from a query. In particular, this formulation quantifies the value of information before the answer to that query is known, in situations where payoffs are unknown and the goal is purely epistemic: That is, to increase knowledge about the state of the world. Whether and how such a theoretical quantity is represented in the brain is unknown. Here we use an event-related functional MRI (fMRI) task design to disentangle information expectation, information revelation and categorization outcome anticipation, and response-contingent reward processing in a visual probabilistic categorization task. We identify a neural signature corresponding to the expectation of information, involving the left lateral ventral striatum. Moreover, we show a temporal dissociation in the activation of different reward-related regions, including the nucleus accumbens, medial prefrontal cortex, and orbitofrontal cortex, during information expectation versus reward-related processing.
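The expected value of information invoked here is usually formalized as expected information gain: the entropy of the current belief about the world state minus the expected entropy of the posterior once the query is answered. A minimal sketch of that computation (the two-category prior and 90%-diagnostic query below are hypothetical illustrations, not quantities from the study):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability states."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_information_gain(prior, likelihood):
    """
    Expected reduction in uncertainty about a hidden category from
    observing the answer to a query, before the answer is known.

    prior:      (n_states,)           P(s)
    likelihood: (n_answers, n_states) P(a | s)
    """
    # Marginal probability of each possible answer: P(a) = sum_s P(a|s) P(s)
    p_answer = likelihood @ prior
    eig = entropy(prior)
    for a, pa in enumerate(p_answer):
        if pa == 0:
            continue
        posterior = likelihood[a] * prior / pa  # Bayes' rule
        eig -= pa * entropy(posterior)          # subtract expected residual entropy
    return eig

# Two equally likely categories; the query's answer is 90% diagnostic.
prior = np.array([0.5, 0.5])
likelihood = np.array([[0.9, 0.1],
                       [0.1, 0.9]])
print(expected_information_gain(prior, likelihood))
```

Note that the quantity is positive whenever the prior is uncertain and the query is diagnostic, and zero when the state is already known, which is what makes it a candidate neural signal for purely epistemic value.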

Publication Date:
Journal Name: Proceedings of the National Academy of Sciences
Page Range or eLocation-ID: p. 15200-15208
Sponsoring Org: National Science Foundation
More Like this
  1. Perceptual decision-making has been shown to be influenced by reward expected from alternative options or actions, but the underlying neural mechanisms are currently unknown. More specifically, it is debated whether reward effects are mediated through changes in sensory processing, later stages of decision-making, or both. To address this question, we conducted two experiments in which human participants made saccades to what they perceived to be either the first or second of two visually identical but asynchronously presented targets while we manipulated expected reward from correct and incorrect responses on each trial. By comparing reward-induced bias in target selection (i.e., reward bias) during the two experiments, we determined whether reward caused changes in sensory or decision-making processes. We found similar reward biases in the two experiments, indicating that reward information mainly influenced later stages of decision-making. Moreover, the observed reward biases were independent of the individual's sensitivity to sensory signals. This suggests that reward effects were determined heuristically via modulation of decision-making processes instead of sensory processing. To further explain our findings and uncover plausible neural mechanisms, we simulated our experiments with a cortical network model and tested alternative mechanisms for how reward could exert its influence. We found that our experimental observations are more compatible with reward-dependent input to the output layer of the decision circuit. Together, our results suggest that, during a temporal judgment task, reward exerts its influence via changing later stages of decision-making (i.e., response bias) rather than early sensory processing (i.e., perceptual bias).
  2. Affective neuroscience research suggests that maturational changes in reward circuitry during adolescence present opportunities for new learning, but likely also contribute to increases in vulnerability for psychiatric disorders such as depression and substance abuse. Basic research in animal models and human neuroimaging has made progress in understanding the normal development of reward circuitry in adolescence, yet, few functional neuroimaging studies have examined puberty-related influences on the functioning of this circuitry. The goal of this study was to address this gap by examining the extent to which striatal activation and cortico-striatal functional connectivity to cues predicting upcoming rewards would be positively associated with pubertal status and levels of pubertal hormones (dehydroepiandrosterone, testosterone, estradiol). Participants included 79 adolescents (10-13 year olds; 47 girls) varying in pubertal status who performed a novel reward cue processing task during fMRI. Pubertal maturation was assessed using sex-specific standardized composite measures based on Tanner staging (self-report and clinical assessment) and scores from the Pubertal Development Scale. These composite measures were computed to index overall pubertal maturation as well as maturation of the adrenal and gonadal axes separately for boys and girls. Basal levels of circulating pubertal hormones were measured using immunoassays from three samples collected weekly upon awakening across a three-week period. Results indicated greater striatal activation and functional connectivity between nucleus accumbens (NAcc) and medial prefrontal cortex (mPFC) to reward cue (vs. no reward cue) on this task. Also, girls with higher levels of estradiol showed reduced activation in left and right caudate and greater NAcc-putamen connectivity. Girls with higher levels of testosterone showed greater NAcc connectivity with the anterior cingulate cortex and the insula.
There were no significant associations in boys. Findings suggest that patterns of activation and connectivity in cortico-striatal regions are associated with reward cue processing, particularly in girls. Longitudinal follow-up neuroimaging studies are needed to fully characterize puberty-specific effects on the development of these neural regions and how such changes may contribute to pathways of risk or resilience in adolescence.
  3. Learning signals during reinforcement learning and cognitive control rely on valenced reward prediction errors (RPEs) and non-valenced salience prediction errors (PEs) driven by surprise magnitude. A core debate in reward learning focuses on whether valenced and non-valenced PEs can be isolated in the human electroencephalogram (EEG). We combine behavioral modeling and single-trial EEG regression to disentangle sequential PEs in an interval timing task dissociating outcome valence, magnitude, and probability. Multiple regression across temporal, spatial, and frequency dimensions characterized a spatio-tempo-spectral cascade from early valenced RPE value to non-valenced RPE magnitude, followed by outcome probability indexed by a late frontal positivity. Separating negative and positive outcomes revealed that the valenced RPE value effect is an artifact of overlap between two non-valenced RPE magnitude responses: frontal theta feedback-related negativity on losses and posterior delta reward positivity on wins. These results reconcile longstanding debates on the sequence of components representing reward and salience PEs in the human EEG.

  4. Primate vision is characterized by constant, sequential processing and selection of visual targets to fixate. Although expected reward is known to influence both processing and selection of visual targets, similarities and differences between these effects remain unclear mainly because they have been measured in separate tasks. Using a novel paradigm, we simultaneously measured the effects of reward outcomes and expected reward on target selection and sensitivity to visual motion in monkeys. Monkeys freely chose between two visual targets and received a juice reward with varying probability for eye movements made to either of them. Targets were stationary apertures of drifting gratings, causing the end points of eye movements to these targets to be systematically biased in the direction of motion. We used this motion-induced bias as a measure of sensitivity to visual motion on each trial. We then performed different analyses to explore effects of objective and subjective reward values on choice and sensitivity to visual motion to find similarities and differences between reward effects on these two processes. Specifically, we used different reinforcement learning models to fit choice behavior and estimate subjective reward values based on the integration of reward outcomes over multiple trials. Moreover, to compare the effects of subjective reward value on choice and sensitivity to motion directly, we considered correlations between each of these variables and integrated reward outcomes on a wide range of timescales. We found that, in addition to choice, sensitivity to visual motion was also influenced by subjective reward value, although the motion was irrelevant for receiving reward. Unlike choice, however, sensitivity to visual motion was not affected by objective measures of reward value. Moreover, choice was determined by the difference in subjective reward values of the two options, whereas sensitivity to motion was influenced by the sum of values.
Finally, models that best predicted visual processing and choice used sets of estimated reward values based on different types of reward integration and timescales. Together, our results demonstrate separable influences of reward on visual processing and choice, and point to the presence of multiple brain circuits for the integration of reward outcomes.
  5. Individuals with suicidal thoughts and behaviors experience abnormalities in reward-related processes, yet little is known about specific components or stages of reward processing that are impaired, especially in children. The primary aim of this study was to conduct an investigation of the Initial Response to Reward subconstruct of the National Institute of Mental Health’s Research Domain Criteria in relation to recent suicidal ideation (SI) in children. Participants were 23 children between the ages of 7 and 11 with a history of recent SI and 46 demographically and clinically matched children with no recent SI. Children completed a simple guessing task during which electroencephalographic signals were continuously recorded to isolate the reward positivity (RewP) event-related potential; specifically, we examined change in RewP (∆RewP), quantified as the difference between neural responses to monetary gains and neural responses to monetary losses. Children with recent SI exhibited significantly smaller (i.e., blunted) ∆RewP, providing initial evidence for blunted initial responses to reward in these children.

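The ∆RewP measure in study 5 is, computationally, a gain-minus-loss contrast on ERP amplitudes. A minimal sketch, using hypothetical single-trial amplitudes (the real score is derived from epoched EEG at fronto-central electrodes, not from numbers like these):

```python
import numpy as np

def delta_rewp(gain_amplitudes, loss_amplitudes):
    """
    Reward positivity difference score: mean ERP amplitude (e.g., in
    microvolts) on monetary-gain trials minus the mean amplitude on
    monetary-loss trials. A smaller (blunted) value indicates a weaker
    initial neural response to reward.
    """
    return np.mean(gain_amplitudes) - np.mean(loss_amplitudes)

# Hypothetical single-trial amplitudes for one participant (microvolts)
gains = np.array([6.2, 5.8, 7.1, 6.5])
losses = np.array([2.0, 1.5, 2.4, 2.1])
print(delta_rewp(gains, losses))
```

Framing the score as a difference wave is what lets group comparisons (e.g., children with vs. without recent SI) isolate reward responsiveness from overall feedback-evoked activity.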
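Studies 3 and 4 both lean on reward prediction errors from trial-by-trial reinforcement-learning models. A minimal Rescorla-Wagner sketch of that idea (the learning rate and simulated outcomes below are illustrative assumptions, not fitted values from either study):

```python
import numpy as np

def rescorla_wagner(outcomes, alpha=0.1, v0=0.0):
    """
    Track the subjective value V of an option across trials.
    Each trial's reward prediction error (RPE) is delta = r - V,
    and the value is updated as V <- V + alpha * delta.
    Returns per-trial pre-update values and RPEs.
    """
    v = v0
    values, rpes = [], []
    for r in outcomes:
        delta = r - v          # valenced RPE: signed surprise
        values.append(v)
        rpes.append(delta)
        v += alpha * delta     # learning-rate-weighted update
    return np.array(values), np.array(rpes)

# Rewards delivered with probability 0.7: value climbs toward 0.7
rng = np.random.default_rng(0)
outcomes = (rng.random(500) < 0.7).astype(float)
values, rpes = rescorla_wagner(outcomes, alpha=0.1)
print(values[-1])
```

The per-trial `rpes` are the kind of regressor that single-trial EEG or fMRI analyses correlate with neural signals, and `values` are the subjective-value estimates used to predict choice.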