
Title: Brain-optimized deep neural network models of human visual areas learn non-hierarchical representations
Abstract

Deep neural networks (DNNs) optimized for visual tasks learn representations that align layer depth with the hierarchy of visual areas in the primate brain. One interpretation of this finding is that hierarchical representations are necessary to accurately predict brain activity in the primate visual system. To test this interpretation, we optimized DNNs to directly predict brain activity measured with fMRI in human visual areas V1-V4. We trained a single-branch DNN to predict activity in all four visual areas jointly, and a multi-branch DNN to predict each visual area independently. Although it was possible for the multi-branch DNN to learn hierarchical representations, only the single-branch DNN did so. This result shows that hierarchical representations are not necessary to accurately predict human brain activity in V1-V4, and that DNNs that encode brain-like visual representations may differ widely in their architecture, ranging from strict serial hierarchies to multiple independent branches.
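The architectural contrast described in the abstract can be sketched schematically. The following is a minimal illustration, not the authors' model: each "layer" is a placeholder that merely records its name (a real model would use trainable convolutional layers), and the layer counts and readout depths are illustrative assumptions.

```python
# Schematic contrast between a single-branch and a multi-branch encoding model.
# NOT the paper's implementation: layers are placeholders that record their
# names, and the number of layers per area is an illustrative assumption.

AREAS = ["V1", "V2", "V3", "V4"]

def make_layer(name):
    # Stand-in for a trainable layer (e.g., convolution + nonlinearity).
    return lambda x: x + [name]  # append the layer name to the processing path

def single_branch_model(n_layers=4):
    # One shared stack; each area reads out at a different depth, so the
    # areas' representations are forced into a serial hierarchy.
    stack = [make_layer(f"shared_layer{i}") for i in range(n_layers)]
    def predict(x):
        preds = {}
        for area, layer in zip(AREAS, stack):
            x = layer(x)
            preds[area] = list(x)  # this area's readout taps the stack here
        return preds
    return predict

def multi_branch_model(n_layers=4):
    # One independent stack per area; nothing in the architecture forces
    # the branches to learn hierarchically related representations.
    branches = {area: [make_layer(f"{area}_layer{i}") for i in range(n_layers)]
                for area in AREAS}
    def predict(x):
        preds = {}
        for area, stack in branches.items():
            h = list(x)
            for layer in stack:
                h = layer(h)
            preds[area] = h
        return preds
    return predict
```

In the single-branch model, the V4 readout necessarily passes through the layers feeding V1-V3; in the multi-branch model, each area's prediction path is independent, which permits (but does not require) non-hierarchical representations.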

NSF-PAR ID: 10420703
Publisher / Repository: Nature Publishing Group
Journal Name: Nature Communications
Volume: 14
Issue: 1
ISSN: 2041-1723
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

     The mammalian sensory neocortex consists of hierarchically organized areas reciprocally connected via feedforward (FF) and feedback (FB) circuits. Several theories of hierarchical computation ascribe the bulk of the computational work of the cortex to looped FF-FB circuits between pairs of cortical areas. However, whether such corticocortical loops exist remains unclear. In higher mammals, individual FF-projection neurons send afferents almost exclusively to a single higher-level area. However, it is unclear whether FB-projection neurons show similar area specificity, and whether they influence FF-projection neurons directly or indirectly. Using viral-mediated monosynaptic circuit tracing in macaque primary visual cortex (V1), we show that V1 neurons sending FF projections to area V2 receive monosynaptic FB inputs from V2, but not from other V1-projecting areas. We also find monosynaptic FB-to-FB neuron contacts as a second motif of FB connectivity. Our results support the existence of FF-FB loops in primate cortex and suggest that FB can rapidly and selectively influence the activity of incoming FF signals.
  2. Abstract

    Organisms process sensory information in the context of their own moving bodies, an idea referred to as embodiment. This idea is important for developmental neuroscience, robotics and systems neuroscience. The mechanisms supporting embodiment are unknown, but a manifestation could be the observation in mice of brain-wide neuromodulation, including in the primary visual cortex, driven by task-irrelevant spontaneous body movements. We tested this hypothesis in macaque monkeys (Macaca mulatta), a primate model for human vision, by simultaneously recording visual cortex activity and facial and body movements. We also sought a direct comparison using an analogous approach to those used in mouse studies. Here we found that activity in the primate visual cortex (V1, V2 and V3/V3A) was associated with the animals’ own movements, but this modulation was largely explained by the impact of the movements on the retinal image, that is, by changes in visual input. These results indicate that visual cortex in primates is minimally driven by spontaneous movements and may reflect species-specific sensorimotor strategies.

  3. Abstract

    The cerebral cortex of primates encompasses multiple anatomically and physiologically distinct areas processing visual information. Areas V1, V2, and V5/MT are conserved across mammals and are central for visual behavior. To facilitate the generation of biologically accurate computational models of primate early visual processing, here we provide an overview of over 350 published studies of these three areas in the genus Macaca, whose visual system provides the closest model for human vision. The literature reports 14 anatomical connection types from the lateral geniculate nucleus of the thalamus to V1 having distinct layers of origin or termination, and 194 connection types between V1, V2, and V5, forming multiple parallel and interacting visual processing streams. Moreover, within V1, there are reports of 286 and 120 types of intrinsic excitatory and inhibitory connections, respectively. Physiologically, tuning of neuronal responses to 11 types of visual stimulus parameters has been consistently reported. Overall, the optimal spatial frequency (SF) of constituent neurons decreases with cortical hierarchy. Moreover, V5 neurons are distinct from neurons in other areas for their higher direction selectivity, higher contrast sensitivity, higher temporal frequency tuning, and wider SF bandwidth. We also discuss currently unavailable data that could be useful for biologically accurate models.

  4. Andreas Krause, Barbara Engelhardt (Eds.)
    Reconstructing natural images from fMRI recordings is a challenging task of great importance in neuroscience. Current architectures are bottlenecked because they fail to effectively capture the hierarchical processing of visual stimuli that takes place in the human brain. Motivated by this, we introduce a novel neural network architecture for the problem of neural decoding. Our architecture uses hierarchical variational autoencoders (HVAEs) to learn meaningful representations of natural images and leverages their latent-space hierarchy to learn voxel-to-image mappings. By mapping the early stages of the visual pathway to the first set of latent variables and the higher visual cortex areas to the deeper layers in the latent hierarchy, we construct a latent-variable neural decoding model that replicates hierarchical visual information processing. Our model achieves better reconstructions than the state of the art, and our ablation study indicates that the hierarchical structure of the latent space is responsible for this performance.
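The voxel-to-latent correspondence this abstract describes can be sketched in outline. This is a schematic illustration under stated assumptions, not the paper's implementation: the ROI names, the number of latent levels, and the per-level "decoders" below are all hypothetical placeholders.

```python
# Schematic sketch of mapping brain regions to levels of a hierarchical VAE's
# latent space. ROI names, level count, and decoders are illustrative
# assumptions, not the paper's actual model.

# Assumed correspondence between ROIs and latent-hierarchy levels:
# early visual pathway -> first latent variables, higher areas -> deeper levels.
ROI_TO_LEVEL = {
    "V1": 0, "V2": 0,   # early visual pathway
    "V4": 1,            # intermediate
    "LOC": 2,           # higher visual cortex (hypothetical choice of ROI)
}

def fit_voxel_to_latent(voxels_by_roi, n_levels=3):
    # Group voxel responses by the latent level their ROI maps to.
    # A real model would fit a regression per level; here we just collect them.
    level_inputs = {level: [] for level in range(n_levels)}
    for roi, voxels in voxels_by_roi.items():
        level_inputs[ROI_TO_LEVEL[roi]].extend(voxels)
    return level_inputs

def decode(level_inputs, decoder_per_level):
    # Run each level's decoder in deepest-first order, mimicking an HVAE's
    # coarse-to-fine, top-down generation of the reconstructed image.
    image = []
    for level in sorted(level_inputs, reverse=True):
        image = decoder_per_level[level](level_inputs[level], image)
    return image
```

The key design point the abstract argues for is the grouping step itself: routing early-area voxels into shallow latents and higher-area voxels into deep latents is what lets the decoder mirror the brain's hierarchical processing.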
  5. To fluidly engage with the world, our brains must simultaneously represent both the scene in front of us and our memory of the immediate surrounding environment (i.e., local visuospatial context). How does the brain's functional architecture enable sensory and mnemonic representations to closely interface while also avoiding sensory-mnemonic interference? Here, we asked this question using first-person, head-mounted virtual reality and fMRI. Using virtual reality, human participants of both sexes learned a set of immersive, real-world visuospatial environments in which we systematically manipulated the extent of visuospatial context associated with a scene image in memory across three learning conditions, spanning from a single FOV to a city street. We used individualized, within-subject fMRI to determine which brain areas support memory of the visuospatial context associated with a scene during recall (Experiment 1) and recognition (Experiment 2). Across the whole brain, activity in three patches of cortex was modulated by the amount of known visuospatial context, each located immediately anterior to one of the three scene perception areas of high-level visual cortex. Individual subject analyses revealed that these anterior patches corresponded to three functionally defined place memory areas, which selectively respond when visually recalling personally familiar places. In addition to showing activity levels that were modulated by the amount of visuospatial context, multivariate analyses showed that these anterior areas represented the identity of the specific environment being recalled. Together, these results suggest a convergence zone for scene perception and memory of the local visuospatial context at the anterior edge of high-level visual cortex.

    SIGNIFICANCE STATEMENT: As we move through the world, the visual scene around us is integrated with our memory of the wider visuospatial context. Here, we sought to understand how the functional architecture of the brain enables coexisting representations of the current visual scene and memory of the surrounding environment. Using a combination of immersive virtual reality and fMRI, we show that memory of visuospatial context outside the current FOV is represented in a distinct set of brain areas immediately anterior and adjacent to the perceptually oriented scene-selective areas of high-level visual cortex. This functional architecture would allow efficient interaction between immediately adjacent mnemonic and perceptual areas while also minimizing interference between mnemonic and perceptual representations.
