Title: A special role for Anterior Cingulate Cortex, but not Orbitofrontal Cortex or Basolateral Amygdala, in choices involving information
Humans and other animals make decisions under uncertainty. Choosing an option that provides information can improve decision making. However, subjects often choose information that does not increase the chances of obtaining reward. In a procedure that promotes such paradoxical choice, animals choose between two alternatives: the richer option (No-info) is followed by a cue that is rewarded 50% of the time, and the leaner option (Info) is followed by one of two cues, one always rewarded (100%) and the other never rewarded (0%). Since decisions involve comparing the subjective value of options after integrating all of their features, perhaps including information value, preference for information may rely on cortico-amygdalar circuitry. To test this, male and female Long-Evans rats were prepared with bilateral inhibitory DREADDs in the anterior cingulate cortex (ACC), orbitofrontal cortex (OFC), or basolateral amygdala (BLA), or with null virus infusions as a control. Using a counterbalanced design, we inhibited these regions after stable preference was acquired and during learning of new Info and No-info cues. We found that inhibition of ACC, but not OFC or BLA, selectively destabilized choice preference in female rats without affecting latency to choose or the response rate to cues. A logistic regression fit revealed that the previous choice strongly predicted preference in control animals, but not in female rats following ACC inhibition. BLA inhibition tended to decrease the learning of new cues that signaled the Info option, but had no effect on preference. The results reveal a causal, sex-dependent role for ACC in decisions involving information.
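The logistic regression mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model or data: the single predictor (previous choice coded +1 for Info and -1 for No-info), the function names, the parameter values, and the simulated choice sequences are all assumptions made for illustration. The sketch fits P(choice = Info) = sigmoid(b0 + b1 * previous choice) by Newton-Raphson, so a large positive b1 means the previous choice strongly predicts the current one and a b1 near zero means it does not.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_choices(n_trials, intercept, stay_weight, seed=0):
    """Simulate binary choices (1 = Info, 0 = No-info).

    Synthetic data only: the previous choice (coded +1/-1) biases the
    next choice with strength `stay_weight` (a hypothetical parameter).
    """
    rng = random.Random(seed)
    choices = [1 if rng.random() < 0.5 else 0]
    for _ in range(n_trials - 1):
        prev = 1.0 if choices[-1] == 1 else -1.0
        p_info = sigmoid(intercept + stay_weight * prev)
        choices.append(1 if rng.random() < p_info else 0)
    return choices


def fit_previous_choice_model(choices, n_iter=25):
    """Fit P(choice_t = Info) = sigmoid(b0 + b1 * prev_t) by Newton-Raphson."""
    x = [1.0 if c == 1 else -1.0 for c in choices[:-1]]  # previous choice
    y = choices[1:]                                      # current choice
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X'WX (negative Hessian)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break  # singular information matrix, stop updating
        # Newton step: beta += (X'WX)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical comparison: a "sticky" chooser versus a history-independent one.
sticky = simulate_choices(4000, intercept=0.0, stay_weight=1.5, seed=1)
unstable = simulate_choices(4000, intercept=0.0, stay_weight=0.0, seed=2)
print("sticky   (b0, b1):", fit_previous_choice_model(sticky))
print("unstable (b0, b1):", fit_previous_choice_model(unstable))
```

In this toy setup, the sticky sequence stands in for a stable preference and the history-independent sequence for a destabilized one; a real analysis would likely include additional predictors (for example, previous outcome) and subject-level effects.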
Award ID(s):
1844144
NSF-PAR ID:
10458422
Journal Name:
bioRxiv
ISSN:
2692-8205
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The present study evaluated the role of inhibition in paradoxical choice in pigeons. In a paradoxical choice procedure, pigeons receive a choice between two alternatives. Choosing the “suboptimal” alternative is followed 20% of the time by one cue (the S+) that is always reinforced, and 80% of the time by another cue (S-) that is never reinforced. Thus, this alternative leads to an overall reinforcement rate of 20%. Choosing the “optimal” alternative, however, is followed by one of two cues (S3 or S4), each reinforced 50% of the time. Thus, this alternative leads to an overall reinforcement rate of 50%. González and Blaisdell (2021) reported that development of paradoxical choice was positively correlated to the development of inhibition to the S- (signal that no food will be delivered on that trial) post-choice stimulus. The current experiment tested the hypothesis that inhibition to a post-choice stimulus is causally related to suboptimal preference. Following acquisition of suboptimal preference, pigeons received two manipulations: in one condition one of the cues in the optimal alternative (S4) was extinguished and, in another condition, the S- cue was partially reinforced. When tested on the choice task afterward, both manipulations resulted in a decrement in suboptimal preference. This result is paradoxical given that both manipulations made the suboptimal alternative the richer option. We discuss the implications of our results, arguing that inhibition of a post-choice cue increases attraction to or value of that choice. 
  2. Survival relies on the ability to flexibly choose between different actions according to varying environmental circumstances. Many lines of evidence indicate that action selection involves signaling in corticostriatal circuits, including the orbitofrontal cortex (OFC) and dorsomedial striatum (DMS). While choice-specific responses have been found in individual neurons from both areas, it is unclear whether populations of OFC or DMS neurons are better at encoding an animal’s choice. To address this, we trained head-fixed mice to perform an auditory guided two-alternative choice task, which required moving a joystick forward or backward. We then used silicon microprobes to simultaneously measure the spiking activity of OFC and DMS ensembles, allowing us to directly compare population dynamics between these areas within the same animals. Consistent with previous literature, both areas contained neurons that were selective for specific stimulus-action associations. However, analysis of concurrently recorded ensemble activity revealed that the animal’s trial-by-trial behavior could be decoded more accurately from DMS dynamics. These results reveal substantial regional differences in encoding action selection, suggesting that DMS neural dynamics are more specialized than OFC at representing an animal’s choice of action. NEW & NOTEWORTHY While previous literature shows that both orbitofrontal cortex (OFC) and dorsomedial striatum (DMS) represent information relevant to selecting specific actions, few studies have directly compared neural signals between these areas. Here we compared OFC and DMS dynamics in mice performing a two-alternative choice task. We found that the animal’s choice could be decoded more accurately from DMS population activity. This work provides among the first evidence that OFC and DMS differentially represent information about an animal’s selected action. 
  3. Abstract

    A goal of the comparative approach is to test a variety of species on the same task. Here, we examined whether the factors that helped capuchin monkeys improve their performance in a dichotomous choice task would generalize to three other primate species: orangutans, gorillas, and drill monkeys. In this task, subjects have access to two options, each resulting in an identical food, but one (the ephemeral option) is only available if it is chosen first, whereas the other one (the permanent option) is always available. Therefore, the food‐maximizing solution is to choose the ephemeral option first, followed by the permanent option for an additional reward. On the original version (plate task), the options were discriminated by the color and pattern of the plates holding the food, while on two subsequent versions we used altered cues that we predicted would improve performance: (1) the color of the foods themselves (color task), which we hypothesized was relevant to primates, who choose foods rather than substrates on which foods are found when foraging, and (2) patterned cups covering the foods (cup task), which we hypothesized would help primates avoid the prepotent response associated with visible food. Like capuchins, all three species initially failed to solve the plate task. However, while orangutans improved their performance from the plate to the color task, they did not for the cup task, and only a few gorillas and no drills succeeded in either task. Unfortunately, our ability to interpret these data was obscured by differences in the subjects' level of experience with cognitive testing and practical constraints that precluded the use of completely identical procedures across species. Nonetheless, we consider what these results can tell us, and discuss the value of conducting studies across multiple sites despite unavoidable differences.

     
  4.
    In our everyday lives, we often have to choose between many different options. When deciding what to order off a menu, for example, or what type of soda to buy in the supermarket, we have a range of possibilities to consider. So how do we decide what to go for? Researchers believe we make such choices by assigning a subjective value to each of the available options. But we can do this in several different ways. We could look at every option in turn, and then choose the best one once we have considered them all. This is a so-called ‘rational’ decision-making approach. But we could also consider each of the options one at a time and stop as soon as we find one that is good enough. This strategy is known as ‘satisficing’. In both approaches, we use our eyes to gather information about the items available. Most scientists have assumed that merely looking at an item – such as a particular brand of soda – does not affect how we feel about that item. But studies in which animals or people choose between much smaller sets of objects – usually up to four – suggest otherwise. The results from these studies indicate that looking at an item makes that item more attractive to the observer, thereby increasing its subjective value. Thomas et al. now show that gaze also plays an active role in the decision-making process when people are spoilt for choice. Healthy volunteers looked at pictures of up to 36 snack foods on a screen and were asked to select the one they would most like to eat. The researchers then recorded the volunteers’ choices and response times, and used eye-tracking technology to follow the direction of their gaze. They then tested which of the various decision-making strategies could best account for all the behaviour. The results showed that the volunteers’ behaviour was best explained by computer models that assumed that looking at an item increases its subjective value. 
    Moreover, the results confirmed that we do not examine all items and then choose the best one. But neither do we use a purely satisficing approach: the volunteers chose the last item they had looked at less than half the time. Instead, we make decisions by comparing individual items against one another, going back and forth between them. The longer we look at an item, the more attractive it becomes, and the more likely we are to choose it.
  5. Abstract

    The meta-reinforcement learning (meta-RL) framework, which involves RL over multiple timescales, has been successful in training deep RL models that generalize to new environments. It has been hypothesized that the prefrontal cortex may mediate meta-RL in the brain, but the evidence is scarce. Here we show that the orbitofrontal cortex (OFC) mediates meta-RL. We trained mice and deep RL models on a probabilistic reversal learning task across sessions during which they improved their trial-by-trial RL policy through meta-learning. Ca2+/calmodulin-dependent protein kinase II-dependent synaptic plasticity in OFC was necessary for this meta-learning but not for the within-session trial-by-trial RL in experts. After meta-learning, OFC activity robustly encoded value signals, and OFC inactivation impaired the RL behaviors. Longitudinal tracking of OFC activity revealed that meta-learning gradually shapes population value coding to guide the ongoing behavioral policy. Our results indicate that two distinct RL algorithms with distinct neural mechanisms and timescales coexist in OFC to support adaptive decision-making.

     