Title: The contribution of distributional activities of dopamine in the exploration
Decision making in natural settings requires efficient exploration to handle uncertainty. Because associations between actions and outcomes are uncertain, animals need to balance exploration and exploitation to select the actions that lead to maximal reward. The computational principles by which animal brains explore during decision-making are poorly understood. Our challenge here was to build a biologically plausible neural network that explores an environment efficiently, and to understand its effectiveness mathematically. One of the most evolutionarily conserved and important systems in decision making is the basal ganglia (BG) [1]. In particular, dopamine (DA) activity in the BG is thought to represent the reward prediction error (RPE) that facilitates reinforcement learning [2]. Our starting point is therefore a cortico-BG loop motif [3]. This network adjusts exploration based on neuronal noise and updates its value estimates through the RPE. To account for the fact that animals adjust exploration based on experience, we modified the network in two ways. First, it was recently discovered that DA does not simply represent a scalar RPE; rather, it represents the RPE as a distribution [4]. We incorporated this distributional RPE framework and extended the hypothesis, allowing an RPE distribution to update the posterior over action values encoded by cortico-BG connections. Second, it is known that firing in layer 2/3 of cortex is variable and sparse [5]. Our network therefore includes random sparsification of cortical activity as a mechanism for sampling from this posterior, enabling experience-based exploration. Combining these two features, our network takes the uncertainty of its value estimates into account to accomplish efficient exploration in a variety of environments.
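As a rough illustration of how these two ingredients could interact, here is a minimal Python sketch of a bandit learner, assuming a simplified expectile-style distributional RPE code and random unit dropout standing in for cortical sparsification. The bandit, parameters, and readout rule are invented for exposition; this is not the paper's network or its mathematical analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-armed bandit; the reward means are invented for illustration.
true_means = np.array([0.2, 0.5, 0.8])
n_actions, n_units = len(true_means), 20

# Each action's value is encoded by a population of units learning different
# expectiles of its reward distribution (a simplified distributional-RPE code).
taus = np.linspace(0.05, 0.95, n_units)   # per-unit asymmetry
values = np.zeros((n_actions, n_units))   # stand-in for cortico-BG weights
alpha = 0.1
keep_prob = 0.5                           # sparsification level

for trial in range(2000):
    # Exploration: randomly silence units (sparsification) and read out the mean
    # of the survivors; the readout is noisier for actions whose value code is wider.
    mask = rng.random((n_actions, n_units)) < keep_prob
    mask[~mask.any(axis=1), 0] = True     # keep at least one unit per action
    readout = (values * mask).sum(axis=1) / mask.sum(axis=1)
    a = int(np.argmax(readout))

    reward = rng.normal(true_means[a], 0.5)

    # Distributional RPE: each unit updates with its own asymmetric learning rate,
    # so the population spans the reward distribution rather than only its mean.
    rpe = reward - values[a]
    lr = np.where(rpe > 0, taus, 1.0 - taus) * alpha
    values[a] += lr * rpe

print("mean value estimates:", np.round(values.mean(axis=1), 2))
```

In this toy version, wider value distributions produce noisier sparse readouts and hence more exploration of uncertain actions, which is the qualitative behavior the abstract describes.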
Award ID(s):
1810758
PAR ID:
10437193
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
9th workshop on Biological Distributed Algorithms (BDA)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Animals flexibly select actions that maximize future rewards despite facing uncertainty in sensory inputs, action-outcome associations or contexts. The computational and circuit mechanisms underlying this ability are poorly understood. A clue to such computations can be found in the neural systems involved in representing sensory features, sensorimotor-outcome associations and contexts. Specifically, the basal ganglia (BG) have been implicated in forming sensorimotor-outcome associations [1], while the thalamocortical loop between the prefrontal cortex (PFC) and mediodorsal thalamus (MD) has been shown to engage in contextual representations [2, 3]. Interestingly, both human and non-human animal experiments indicate that the MD represents different forms of uncertainty [3, 4]. However, finding evidence for uncertainty representation gives little insight into how it is utilized to drive behavior. Normative theories have excelled at providing such computational insights. For example, deploying traditional machine learning algorithms to fit human decision-making behavior has clarified how associative uncertainty alters exploratory behavior [5, 6]. However, despite their computational insight and ability to fit behaviors, normative models cannot be directly related to neural mechanisms. Therefore, a critical gap exists between what we know about the neural representation of uncertainty on one end and the computational functions uncertainty serves in cognition on the other. This gap can be filled with mechanistic neural models that both approximate normative models and generate experimentally observed neural representations. In this work, we build a mechanistic cortico-thalamo-BG loop network model that directly fills this gap. The model includes computationally relevant mechanistic details of both BG and thalamocortical circuits, such as distributional activities of dopamine [7] and thalamocortical projections modulating cortical effective connectivity [3] and plasticity [8] via interneurons. We show that our network can explore various environments more efficiently and flexibly than commonly used machine learning algorithms, and that the mechanistic features we include are crucial for handling different types of uncertainty in decision-making. Furthermore, through derivation and mathematical proofs, we show that our model approximates two novel normative theories. We show mathematically that the first has near-optimal performance on bandit tasks. The second is a generalization of the well-known CUSUM algorithm, which is known to be optimal on single change-point detection tasks [9]; our normative model expands on this by detecting multiple sequential contextual changes. To our knowledge, our work is the first to link computational insights, normative models and neural realization together in decision-making under various forms of uncertainty.
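For orientation, the CUSUM algorithm referenced above is the classical sequential change-point detector. Below is a minimal, hypothetical Python sketch of one-sided CUSUM on synthetic Gaussian data; the function name, threshold, and data are invented for illustration, and the paper's generalization to multiple sequential contextual changes is not reproduced here.

```python
import numpy as np

def cusum_alarm_time(x, pre_mean, post_mean, sigma=1.0, threshold=5.0):
    """Classical one-sided CUSUM: accumulate the log-likelihood ratio of the
    post-change vs. pre-change model, clip at zero, and alarm at the first
    threshold crossing. Returns the alarm index, or None if no alarm."""
    s = 0.0
    for t, xt in enumerate(x):
        llr = ((xt - pre_mean) ** 2 - (xt - post_mean) ** 2) / (2.0 * sigma ** 2)
        s = max(0.0, s + llr)
        if s > threshold:
            return t
    return None

# Synthetic example: a single mean shift from 0 to 1 at time 200.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(1.0, 1.0, 200)])
print("alarm raised at sample:", cusum_alarm_time(data, pre_mean=0.0, post_mean=1.0))
```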
  2. Computation of optimal recovery decisions for community resilience assurance post-hazard is a combinatorial decision-making problem under uncertainty. It involves solving a large-scale optimization problem, which is significantly aggravated by the introduction of uncertainty. In this paper, we draw upon established tools from multiple research communities to provide an effective solution to this challenging problem. We provide a stochastic model of damage to the water network (WN) within a testbed community following a severe earthquake and compute near-optimal recovery actions for restoration of the water network. We formulate this stochastic decision-making problem as a Markov Decision Process (MDP), and solve it using a popular class of heuristic algorithms known as rollout. A simulation-based representation of MDPs is utilized in conjunction with rollout and the Optimal Computing Budget Allocation (OCBA) algorithm to address the resulting stochastic simulation optimization problem. Our method employs non-myopic planning with efficient use of simulation budget. We show, through simulation results, that rollout fused with OCBA performs competitively with respect to rollout with total equal allocation (TEA) at a meagre simulation budget of 5-10% of rollout with TEA, which is a crucial step towards addressing large-scale community recovery problems following natural disasters.
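The rollout idea in this abstract is simple to sketch: estimate each candidate action's value by simulating a heuristic base policy forward, then act greedily on those estimates. The toy repair problem below is entirely hypothetical (it is not the water-network model), and it splits the simulation budget equally (TEA); OCBA would instead reallocate simulations toward actions whose estimates are close and noisy.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in problem: state = number of damaged components; repairs succeed
# stochastically; reward penalizes remaining damage each step.
ACTIONS = [0, 1]  # 0: cautious repair (one attempt, reliable), 1: aggressive (two attempts, risky)

def simulate(damaged, action):
    attempts = min(damaged, 2 if action == 1 else 1)
    p_success = 0.5 if action == 1 else 0.8
    fixed = rng.binomial(attempts, p_success)
    next_damaged = damaged - fixed
    return next_damaged, -float(next_damaged)

def base_policy(damaged):
    return 0  # simple heuristic policy that rollout improves upon

def rollout_action(state, horizon=10, sims_per_action=50):
    """One-step rollout with total equal allocation (TEA) of the simulation budget."""
    estimates = []
    for a in ACTIONS:
        returns = []
        for _ in range(sims_per_action):
            s, r = simulate(state, a)      # apply the candidate action once...
            total = r
            for _ in range(horizon - 1):   # ...then follow the base policy
                s, r = simulate(s, base_policy(s))
                total += r
            returns.append(total)
        estimates.append(np.mean(returns))
    return ACTIONS[int(np.argmax(estimates))]

print("rollout action from a fully damaged state:", rollout_action(5))
```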
  3. The increasing interconnectivity in our infrastructure poses a significant security challenge, with external threats having the potential to penetrate and propagate throughout the network. Bayesian attack graphs have proven to be effective in capturing the propagation of attacks in complex interconnected networks. However, most existing security approaches fail to systematically account for the limitation of resources and uncertainty arising from the complexity of attacks and possible undetected compromises. To address these challenges, this paper proposes a partially observable Markov decision process (POMDP) model for network security under uncertainty. The POMDP model accounts for uncertainty in monitoring and defense processes, as well as the probabilistic attack propagation. This paper develops two security policies based on the optimal stationary defense policy for the underlying POMDP state process (i.e., a network with known compromises): the estimation-based policy that performs the defense actions corresponding to the optimal minimum mean square error state estimation and the distribution-based policy that utilizes the posterior distribution of network compromises to make defense decisions. Optimal monitoring policies are designed to specifically support each of the defense policies, allowing dynamic allocation of monitoring resources to capture network vulnerabilities/compromises. The performance of the proposed policies is examined in terms of robustness, accuracy, and uncertainty using various numerical experiments.
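The distinction between estimation-based and distribution-based policies rests on belief tracking: the defender maintains a posterior over hidden compromise states and either collapses it to a point estimate or acts on the full distribution. The fragment below is a deliberately tiny, hypothetical illustration of such belief filtering for a single host with a threshold rule; the state space, parameters, and decision rule are invented for exposition and are not the paper's policies.

```python
import numpy as np

# One host, two hidden states (0 = clean, 1 = compromised), noisy monitoring alerts.
P_ATTACK = 0.05             # per-step probability a clean host becomes compromised
P_ALERT = {0: 0.1, 1: 0.8}  # probability of an alert in each hidden state

def belief_update(b_compromised, alert_observed):
    """One step of Bayesian filtering: propagate attack dynamics, then condition on the alert."""
    # Prediction step: compromise can be introduced but not spontaneously cleared.
    prior = b_compromised + (1.0 - b_compromised) * P_ATTACK
    # Observation step: weight by alert likelihoods and renormalize.
    if alert_observed:
        num = prior * P_ALERT[1]
        den = prior * P_ALERT[1] + (1.0 - prior) * P_ALERT[0]
    else:
        num = prior * (1.0 - P_ALERT[1])
        den = prior * (1.0 - P_ALERT[1]) + (1.0 - prior) * (1.0 - P_ALERT[0])
    return num / den

def distribution_based_defense(b_compromised, threshold=0.6):
    """Toy distribution-style rule: act on the posterior rather than a point estimate."""
    return "reimage host" if b_compromised > threshold else "keep monitoring"

belief = 0.0
for alert in [False, False, True, True]:
    belief = belief_update(belief, alert)
    print(round(belief, 3), distribution_based_defense(belief))
```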
  4. Background: The decision making process undertaken during wildfire responses is complex and prone to uncertainty. In the US, decisions federal land managers make are influenced by numerous and often competing factors. Aims: To assess and validate the presence of decision factors relevant to the wildfire decision making context that were previously known and to identify those that have emerged since the US federal wildfire policy was updated in 2009. Methods: Interviews were conducted across the US while wildfires were actively burning to elucidate time-of-fire decision factors. Data were coded and thematically analysed. Key results: Most previously known decision factors as well as numerous emergent factors were identified. Conclusions: To contextualise decision factors within the decision making process, we offer a Wildfire Decision Framework that has value for policy makers seeking to improve decision making, managers improving their process and wildfire social science researchers. Implications: Managers may gain a better understanding of their decision environment and use our framework as a tool to validate their deliberations. Researchers may use these data to help explain the various pressures and influences modern land and wildfire managers experience. Policy makers and agencies may take institutional steps to align the actions of their staff with desired wildfire outcomes.