

Title: Scene Perception and Visuospatial Memory Converge at the Anterior Edge of Visually Responsive Cortex

To fluidly engage with the world, our brains must simultaneously represent both the scene in front of us and our memory of the immediate surrounding environment (i.e., local visuospatial context). How does the brain's functional architecture enable sensory and mnemonic representations to closely interface while also avoiding sensory-mnemonic interference? Here, we asked this question using first-person, head-mounted virtual reality and fMRI. Using virtual reality, human participants of both sexes learned a set of immersive, real-world visuospatial environments in which we systematically manipulated the extent of visuospatial context associated with a scene image in memory across three learning conditions, spanning from a single FOV to a city street. We used individualized, within-subject fMRI to determine which brain areas support memory of the visuospatial context associated with a scene during recall (Experiment 1) and recognition (Experiment 2). Across the whole brain, activity in three patches of cortex was modulated by the amount of known visuospatial context, each located immediately anterior to one of the three scene perception areas of high-level visual cortex. Individual subject analyses revealed that these anterior patches corresponded to three functionally defined place memory areas, which selectively respond when visually recalling personally familiar places. In addition to showing activity levels that were modulated by the amount of visuospatial context, multivariate analyses showed that these anterior areas represented the identity of the specific environment being recalled. Together, these results suggest a convergence zone for scene perception and memory of the local visuospatial context at the anterior edge of high-level visual cortex.

SIGNIFICANCE STATEMENT: As we move through the world, the visual scene around us is integrated with our memory of the wider visuospatial context. Here, we sought to understand how the functional architecture of the brain enables coexisting representations of the current visual scene and memory of the surrounding environment. Using a combination of immersive virtual reality and fMRI, we show that memory of visuospatial context outside the current FOV is represented in a distinct set of brain areas immediately anterior and adjacent to the perceptually oriented scene-selective areas of high-level visual cortex. This functional architecture would allow efficient interaction between immediately adjacent mnemonic and perceptual areas while also minimizing interference between mnemonic and perceptual representations.

 
NSF-PAR ID:
10433625
Publisher / Repository:
DOI PREFIX: 10.1523
Journal Name:
The Journal of Neuroscience
Volume:
43
Issue:
31
ISSN:
0270-6474
Page Range / eLocation ID:
p. 5723-5737
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Human childhood is characterized by dramatic changes in the mind and brain. However, little is known about the large-scale intrinsic cortical network changes that occur during childhood because of methodological challenges in scanning young children. Here, we overcome this barrier by using sophisticated acquisition and analysis tools to investigate functional network development in children between the ages of 4 and 10 years (n=92; 50 female, 42 male). At multiple spatial scales, age is positively associated with brain network segregation. At the system level, age was associated with segregation of systems involved in attention from those involved in abstract cognition, and with integration among attentional and perceptual systems. Associations between age and functional connectivity are most pronounced in visual and medial prefrontal cortex, the two ends of a gradient from perceptual, externally oriented cortex to abstract, internally oriented cortex. These findings suggest that both ends of the sensory-association gradient may develop early, in contrast to the classical theories that cortical maturation proceeds from back to front, with sensory areas developing first and association areas developing last. More mature patterns of brain network architecture, controlling for age, were associated with better visuospatial reasoning abilities. Our results suggest that as cortical architecture becomes more specialized, children become more able to reason about the world and their place in it.

    SIGNIFICANCE STATEMENT: Anthropologists have called the transition from early to middle childhood the “age of reason”, when children across cultures become more independent. We employ cutting-edge neuroimaging acquisition and analysis approaches to investigate associations between age and functional brain architecture in childhood. Age was positively associated with segregation between cortical systems that process the external world and those that process abstract phenomena like the past, future, and minds of others. Surprisingly, we observed pronounced development at both ends of the sensory-association gradient, challenging the theory that sensory areas develop first and association areas develop last. Our results open new directions for research into how brains reorganize to support rapid gains in cognitive and socioemotional skills as children reach the age of reason.

     
  2. Category selectivity is a fundamental principle of organization of perceptual brain regions. Human occipitotemporal cortex is subdivided into areas that respond preferentially to faces, bodies, artifacts, and scenes. However, observers need to combine information about objects from different categories to form a coherent understanding of the world. How is this multicategory information encoded in the brain? Studying the multivariate interactions between brain regions of male and female human subjects with fMRI and artificial neural networks, we found that the angular gyrus shows joint statistical dependence with multiple category-selective regions. Adjacent regions show effects for the combination of scenes and each other category, suggesting that scenes provide a context to combine information about the world. Additional analyses revealed a cortical map of areas that encode information across different subsets of categories, indicating that multicategory information is not encoded in a single centralized location, but in multiple distinct brain regions.

    SIGNIFICANCE STATEMENT: Many cognitive tasks require combining information about entities from different categories. However, visual information about different categorical objects is processed by separate, specialized brain regions. How is the joint representation from multiple category-selective regions implemented in the brain? Using fMRI movie data and state-of-the-art multivariate statistical dependence based on artificial neural networks, we identified the angular gyrus encoding responses across face-, body-, artifact-, and scene-selective regions. Further, we showed a cortical map of areas that encode information across different subsets of categories. These findings suggest that multicategory information is not encoded in a single centralized location, but at multiple cortical sites which might contribute to distinct cognitive functions, offering insights to understand integration in a variety of domains.

  3. Abstract Research Highlights

    Children with reading disabilities (RD) frequently have a co‐occurring math disability (MD), but the mechanisms behind this high comorbidity are not well understood.

    We examined differences in phonological awareness, reading skills, and executive function between children with RD only versus co‐occurring RD+MD using behavioral and fMRI measures.

    Children with RD only versus RD+MD did not differ in their phonological processing, either behaviorally or in the brain.

    RD+MD was associated with additional behavioral difficulties in working memory, and reduced visual cortex activation during a visuospatial working memory task.

     
  4. Neuroimaging studies of human memory have consistently found that univariate responses in parietal cortex track episodic experience with stimuli (whether stimuli are 'old' or 'new'). More recently, pattern-based fMRI studies have shown that parietal cortex also carries information about the semantic content of remembered experiences. However, it is not well understood how memory-based and content-based signals are integrated within parietal cortex. Here, in humans (males and females), we used voxel-wise encoding models and a recognition memory task to predict the fMRI activity patterns evoked by complex natural scene images based on (1) the episodic history and (2) the semantic content of each image. Models were generated and compared across distinct subregions of parietal cortex and for occipitotemporal cortex. We show that parietal and occipitotemporal regions each encode memory and content information, but they differ in how they combine this information. Among parietal subregions, angular gyrus was characterized by robust and overlapping effects of memory and content. Moreover, subject-specific semantic tuning functions revealed that successful recognition shifted the amplitude of tuning functions in angular gyrus but did not change the selectivity of tuning. In other words, effects of memory and content were additive in angular gyrus. This pattern of data contrasted with occipitotemporal cortex where memory and content effects were interactive: memory effects were preferentially expressed by voxels tuned to the content of a remembered image. Collectively, these findings provide unique insight into how parietal cortex combines information about episodic memory and semantic content.

    SIGNIFICANCE STATEMENT: Neuroimaging studies of human memory have identified multiple brain regions that not only carry information about “whether” a visual stimulus is successfully recognized but also “what” the content of that stimulus includes. However, a fundamental and open question concerns how the brain integrates these two types of information (memory and content). Here, using a powerful combination of fMRI analysis methods, we show that parietal cortex, particularly the angular gyrus, robustly combines memory- and content-related information, but these two forms of information are represented via additive, independent signals. In contrast, memory effects in high-level visual cortex critically depend on (and interact with) content representations. Together, these findings reveal multiple and distinct ways in which the brain combines memory- and content-related information.

     
  5. As augmented and virtual reality (AR/VR) technology matures, a method is desired to represent real-world persons visually and aurally in a virtual scene with high fidelity to craft an immersive and realistic user experience. Current technologies leverage camera and depth sensors to render visual representations of subjects through avatars, and microphone arrays are employed to localize and separate high-quality subject audio through beamforming. However, challenges remain in both realms. In the visual domain, avatars can only map key features (e.g., pose, expression) to a predetermined model, rendering them incapable of capturing the subjects’ full details. Alternatively, high-resolution point clouds can be utilized to represent human subjects. However, such three-dimensional data is computationally expensive to process. In the realm of audio, sound source separation requires prior knowledge of the subjects’ locations. However, it may take unacceptably long for sound source localization algorithms to provide this knowledge, which can still be error-prone, especially with moving objects. These challenges make it difficult for AR systems to produce real-time, high-fidelity representations of human subjects for applications such as AR/VR conferencing that mandate negligible system latency. We present Acuity, a real-time system capable of creating high-fidelity representations of human subjects in a virtual scene both visually and aurally. Acuity isolates subjects from high-resolution input point clouds. It reduces the processing overhead by performing background subtraction at a coarse resolution, then applying the detected bounding boxes to fine-grained point clouds. Meanwhile, Acuity leverages an audiovisual sensor fusion approach to expedite sound source separation. The estimated object location in the visual domain guides the acoustic pipeline to isolate the subjects’ voices without running sound source localization. 
Our results demonstrate that Acuity can isolate multiple subjects’ high-quality point clouds with a maximum latency of 70 ms and average throughput of over 25 fps, while separating audio in less than 30 ms. We provide the source code of Acuity at: https://github.com/nesl/Acuity. 