Title: Theory and model of thalamocortical processing in decision making under uncertainty. Thalamocortical Interactions Gordon Research Conference 2024, Ventura, California, February 2024. Abstract, Poster.
Animals flexibly select actions that maximize future rewards despite facing uncertainty in sensory inputs, action-outcome associations or contexts. The computational and circuit mechanisms underlying this ability are poorly understood. A clue to such computations can be found in the neural systems involved in representing sensory features, sensorimotor-outcome associations and contexts. Specifically, the basal ganglia (BG) have been implicated in forming sensorimotor-outcome associations [1], while the thalamocortical loop between the prefrontal cortex (PFC) and mediodorsal thalamus (MD) has been shown to engage in contextual representations [2, 3]. Interestingly, both human and non-human animal experiments indicate that the MD represents different forms of uncertainty [3, 4]. However, finding evidence for uncertainty representation gives little insight into how it is utilized to drive behavior. Normative theories have excelled at providing such computational insights. For example, deploying traditional machine learning algorithms to fit human decision-making behavior has clarified how associative uncertainty alters exploratory behavior [5, 6]. However, despite their computational insight and ability to fit behaviors, normative models cannot be directly related to neural mechanisms. Therefore, a critical gap exists between what we know about the neural representation of uncertainty on one end and the computational functions uncertainty serves in cognition on the other. This gap can be filled with mechanistic neural models that can approximate normative models as well as generate experimentally observed neural representations. In this work, we build a mechanistic cortico-thalamo-BG loop network model that directly fills this gap.
The model includes computationally relevant mechanistic details of both BG and thalamocortical circuits, such as distributional activities of dopamine [7] and thalamocortical projections that modulate cortical effective connectivity [3] and plasticity [8] via interneurons. We show that our network can explore various environments more efficiently and flexibly than commonly used machine learning algorithms, and that the mechanistic features we include are crucial for handling different types of uncertainty in decision-making. Furthermore, through derivation and mathematical proofs, we show that our model can be approximated by two novel normative theories. We show mathematically that the first has near-optimal performance on bandit tasks. The second is a generalization of the well-known CUSUM algorithm, which is known to be optimal on single change-point detection tasks [9]. Our normative model expands on this by detecting multiple sequential contextual changes. To our knowledge, our work is the first to link computational insights, normative models and neural realization together in decision-making under various forms of uncertainty.
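For readers unfamiliar with CUSUM, the following is a minimal sketch of the classical one-sided CUSUM detector for a single change in the mean of Gaussian observations. It is not the authors' generalization to multiple sequential contextual changes; the function name `cusum_detect`, its parameters, and the toy data are assumptions chosen for illustration only.

```python
def cusum_detect(samples, mean_before, mean_after, sigma, threshold):
    """Return the index at which one-sided CUSUM raises an alarm, or None.

    Assumes Gaussian observations with known standard deviation `sigma`
    whose mean shifts from `mean_before` to `mean_after` at an unknown time.
    """
    stat = 0.0
    for i, x in enumerate(samples):
        # Log-likelihood ratio of the post-change vs. pre-change Gaussian.
        llr = (mean_after - mean_before) / sigma**2 * (x - (mean_before + mean_after) / 2)
        # Accumulate evidence for a change, resetting at zero.
        stat = max(0.0, stat + llr)
        if stat > threshold:
            return i
    return None


# Noise-free toy sequence: the mean jumps from 0 to 1 at index 50.
data = [0.0] * 50 + [1.0] * 50
alarm = cusum_detect(data, mean_before=0.0, mean_after=1.0, sigma=1.0, threshold=2.0)
print(alarm)  # 54: each post-change sample adds 0.5, so the statistic first exceeds 2.0 on the fifth one
```

The reset at zero is what distinguishes CUSUM from a plain running sum: evidence against a change is discarded, so the detector responds quickly once the change occurs. The threshold trades detection delay against false alarms.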
Award ID(s):
2139936 2003830
PAR ID:
10501701
Author(s) / Creator(s):
; ;
Publisher / Repository:
Thalamocortical Interactions Gordon Research Conference 2024
Date Published:
Journal Name:
Thalamocortical Interactions Gordon Research Conference 2024
Format(s):
Medium: X
Location:
Ventura, California
Sponsoring Org:
National Science Foundation
More Like this
  1. Animal brains evolved to optimize behavior in dynamic environments, flexibly selecting actions that maximize future rewards in different contexts. A large body of experimental work indicates that such optimization changes the wiring of neural circuits, appropriately mapping environmental input onto behavioral outputs. A major unsolved scientific question is how optimal wiring adjustments, which must target the connections responsible for rewards, can be accomplished when the relation of sensory inputs, actions taken, and environmental context to rewards is ambiguous. The credit assignment problem can be categorized into context-independent structural credit assignment and context-dependent continual learning. In this perspective, we survey prior approaches to these two problems and advance the notion that the brain’s specialized neural architectures provide efficient solutions. Within this framework, the thalamus with its cortical and basal ganglia interactions serves as a systems-level solution to credit assignment. Specifically, we propose that thalamocortical interaction is the locus of meta-learning where the thalamus provides cortical control functions that parametrize the cortical activity association space. By selecting among these control functions, the basal ganglia hierarchically guide thalamocortical plasticity across two timescales to enable meta-learning. The faster timescale establishes contextual associations to enable behavioral flexibility while the slower one enables generalization to new contexts.
  2. Decision making in natural settings requires efficient exploration to handle uncertainty. Since associations between actions and outcomes are uncertain, animals need to balance exploration and exploitation to select the actions that lead to maximal rewards. The computational principles by which animal brains explore during decision-making are poorly understood. Our challenge here was to build a biologically plausible neural network that efficiently explores an environment and to understand its effectiveness mathematically. One of the most evolutionarily conserved and important systems in decision making is the basal ganglia (BG) [1]. In particular, dopamine (DA) activity in the BG is thought to represent reward prediction error (RPE) to facilitate reinforcement learning [2]. Therefore, our starting point is a cortico-BG loop motif [3]. This network adjusts exploration based on neuronal noise and updates its value estimate through RPE. To account for the fact that animals adjust exploration based on experience, we modified the network in two ways. First, it was recently discovered that DA does not simply represent the scalar RPE value; rather, it represents RPE as a distribution [4]. We incorporated the distributional RPE framework and extended the hypothesis, allowing an RPE distribution to update the posterior of action values encoded by cortico-BG connections. Second, it is known that firing in layer 2/3 of cortex is variable and sparse [5]. Our network thus included a random sparsification of cortical activity as a mechanism of sampling from this posterior for experience-based exploration. Combining these two features, our network is able to take the uncertainty of its value estimates into account to accomplish efficient exploration in a variety of environments.
  4. Genetics is recognized as a significant risk factor in schizophrenia [1], and computational modeling studies have highlighted deficits in belief updating as a key aspect of the disorder and an underlying cause of delusion [2]. In particular, patients often show strong priors on environmental volatility. However, the intricate mechanisms bridging these genetic risk factors and belief updating deficits remain poorly understood. Our challenge here was to build a biologically plausible neural network that provides a link between genetic risk factors and impaired belief updating. In constructing our schizophrenia model, we first focused on the prefrontal cortex (PFC)-mediodorsal thalamus (MD) circuit, given mounting evidence implicating alterations in these regions in schizophrenia pathology [3]. Drawing from experimental findings demonstrating the involvement of MD neurons expressing D2 receptors in cognitive flexibility [4], the known association of D2 receptor genes with heightened schizophrenia risk [5], and the predominant mode of action of antipsychotic treatments as dopamine antagonists at D2 receptors [6], we simulated schizophrenia by reducing the excitability of MD neurons to mimic the hyperactive D2 receptors in schizophrenia. To investigate the belief updating process, we consider a probability reversal task, in which the reward structure switches in blocks every 200 trials. Our normal thalamocortical model is capable of flexibly switching across blocks, and its PFC-MD connections learn the contextual model of the world, a neural signature for continual learning. We further mathematically analyze the model and deduce that, under mild assumptions, the model approximates the CUSUM algorithm, an algorithm known for its optimality in detecting environmental changes [7]. On the other hand, our schizophrenia model exhibited a stronger bias towards environmental volatility, prompting exploratory behaviors following contextual switches.
By mathematical analysis, we deduce that the decreased excitability makes the evidence accumulation dynamics leaky, and therefore the model can switch sporadically, consistent with the qualitative results in schizophrenia patients [2]. Additionally, decreased excitability in MD compromised the ability of PFC-MD connections to accurately learn the environmental model. To address this impairment, we applied current injections to MD to restore activity levels to a range conducive to Hebbian plasticity. Remarkably, the rescue model demonstrated reduced exploratory behavior following switches and exhibited a higher threshold for MD activity switching, indicative of a diminished bias towards environmental volatility. Moreover, the rescue model exhibited improved learning of the environmental model within its PFC-MD connections. These findings suggest a potential mechanism for utilizing deep brain stimulation at a novel site to mitigate schizophrenia symptoms.
  5. The intrinsic uncertainty of sensory information (i.e., evidence) does not necessarily deter an observer from making a reliable decision. Indeed, uncertainty can be reduced by integrating (accumulating) incoming sensory evidence. It is widely thought that this accumulation is instantiated via recurrent rate-code neural networks. Yet, these networks do not fully explain important aspects of perceptual decision-making, such as a subject’s ability to retain accumulated evidence during temporal gaps in the sensory evidence. Here, we utilized computational models to show that cortical circuits can switch flexibly between “retention” and “integration” modes during perceptual decision-making. Further, we found that, depending on how the sensory evidence was read out, we could simulate “stepping” and “ramping” activity patterns, which may be analogous to those seen in different studies of decision-making in the primate parietal cortex. This finding may reconcile these previous empirical studies because it suggests these two activity patterns emerge from the same mechanism.
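The exploration abstract in the list above describes sampling from a posterior over action values to drive experience-based exploration. The textbook algorithm built on that idea is Thompson sampling, sketched below for a Bernoulli bandit with Beta posteriors. This is an illustrative analogue under stated assumptions, not the network model from the abstract: the function name `thompson_bandit`, the Beta(1, 1) prior, and the reward probabilities are all hypothetical choices.

```python
import random


def thompson_bandit(reward_probs, n_trials, seed=0):
    """Run Thompson sampling on a Bernoulli bandit; return pull counts per arm."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    # Beta(1, 1) prior on each arm's reward probability (an assumption).
    successes = [1] * n_arms
    failures = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_trials):
        # Draw one sample per arm from its current posterior.
        samples = [rng.betavariate(successes[a], failures[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled values.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < reward_probs[arm] else 0
        # Conjugate Beta-Bernoulli posterior update for the chosen arm.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_bandit([0.2, 0.8], n_trials=1000)
print(pulls)  # most pulls should go to the second, higher-reward arm
```

Because arms with wide posteriors occasionally produce large samples, uncertain arms keep being tried until the evidence rules them out, which is how uncertainty in the value estimates translates into exploration.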