Title: Semantic novelty modulates neural responses to visual change across the human brain
Abstract

Our continuous visual experience in daily life is dominated by change. Previous research has focused on visual change due to stimulus motion, eye movements or unfolding events, but not their combined impact across the brain or their interactions with semantic novelty. We investigated the neural responses to these sources of novelty during film viewing. We analyzed intracranial recordings in humans across 6328 electrodes from 23 individuals. Responses associated with saccades and film cuts were dominant across the entire brain. Film cuts at semantic event boundaries were particularly effective in the temporal and medial temporal lobe. Saccades to visual targets with high visual novelty were also associated with strong neural responses. Specific locations in higher-order association areas showed selectivity to either high- or low-novelty saccades. We conclude that neural activity associated with film cuts and eye movements is widespread across the brain and is modulated by semantic novelty.
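The abstract does not spell out the analysis pipeline, but studies of this kind typically rely on event-locked averaging: aligning each electrode's trace to saccade or film-cut onsets and averaging across events. The sketch below illustrates that general approach; the sampling rate, window lengths, and synthetic data are all assumptions, not details from the paper.

```python
# Minimal sketch of event-locked averaging for intracranial recordings.
# FS, the analysis window, and the event times are hypothetical.
import numpy as np

FS = 500  # assumed sampling rate (Hz)

def event_locked_average(signal, event_samples, pre=0.2, post=0.6, fs=FS):
    """Average one electrode's trace around event onsets (saccades or film cuts)."""
    n_pre, n_post = int(pre * fs), int(post * fs)
    epochs = [signal[t - n_pre:t + n_post]
              for t in event_samples
              if t - n_pre >= 0 and t + n_post <= len(signal)]
    return np.mean(epochs, axis=0)  # shape: (n_pre + n_post,)

# Usage with synthetic data:
rng = np.random.default_rng(0)
trace = rng.standard_normal(60 * FS)          # one minute of one channel
cuts = rng.integers(FS, 59 * FS, size=40)     # hypothetical film-cut times
erp = event_locked_average(trace, cuts)
```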

 
Award ID(s):
2201835
NSF-PAR ID:
10415066
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Nature Communications
Volume:
14
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Decades of research have shown that global brain states such as arousal can be indexed by measuring the properties of the eyes. The spiking responses of neurons throughout the brain have been associated with the pupil, small fixational saccades, and vigor in eye movements, but it has been difficult to isolate how internal states affect the eyes, and vice versa. While recording from populations of neurons in the visual and prefrontal cortex (PFC), we recently identified a latent dimension of neural activity called “slow drift,” which appears to reflect a shift in a global brain state. Here, we asked if slow drift is correlated with the action of the eyes in distinct behavioral tasks. We recorded from visual cortex (V4) while monkeys performed a change detection task, and from PFC while they performed a memory-guided saccade task. In both tasks, slow drift was associated with the size of the pupil and the microsaccade rate, two external indicators of the internal state of the animal. These results show that metrics related to the action of the eyes are associated with a dominant and task-independent mode of neural activity that can be accessed in the population activity of neurons across the cortex.
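    As a rough illustration of the idea (not this study's estimator), a slow, latent dimension of population activity is often approximated as the top principal component of heavily smoothed spike counts, which can then be correlated with pupil size. Every parameter below is an assumption and the data are synthetic.

    ```python
    # Hedged sketch: PCA on heavily smoothed binned spike counts, then
    # correlating the top component with pupil size.
    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    def slow_drift(spike_counts, smooth_sigma=50):
        """spike_counts: (n_timebins, n_neurons) binned counts."""
        smoothed = gaussian_filter1d(spike_counts.astype(float), smooth_sigma, axis=0)
        smoothed -= smoothed.mean(axis=0)
        # First principal component of smoothed activity = candidate slow drift.
        _, _, vt = np.linalg.svd(smoothed, full_matrices=False)
        return smoothed @ vt[0]

    rng = np.random.default_rng(1)
    counts = rng.poisson(2.0, size=(5000, 80))        # synthetic population
    pupil = gaussian_filter1d(rng.standard_normal(5000), 50)
    drift = slow_drift(counts)
    r = np.corrcoef(drift, pupil)[0, 1]               # drift-pupil correlation
    ```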
  2. Abstract

    Saccadic eye movements (saccades) disrupt the continuous flow of visual information, yet our perception of the visual world remains uninterrupted. Here we assess the representation of the visual scene across saccades from single-trial spike trains of extrastriate visual areas, using a combined electrophysiology and statistical modeling approach. Using a model-based decoder we generate a high temporal resolution readout of visual information, and identify the specific changes in neurons’ spatiotemporal sensitivity that underlie an integrated perisaccadic representation of visual space. Our results show that by maintaining a memory of the visual scene, extrastriate neurons produce an uninterrupted representation of the visual world. Extrastriate neurons exhibit a late response enhancement close to the time of saccade onset, which preserves the latest pre-saccadic information until the post-saccadic flow of retinal information resumes. These results show how our brain exploits available information to maintain a representation of the scene while visual inputs are disrupted.
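    The decoder itself is not specified in this abstract, but as a hedged illustration of a model-based readout, one can score candidate stimulus values by their Poisson log-likelihood under an encoding model and take the maximum. The Gaussian tuning curves below are a stand-in for the paper's fitted spatiotemporal sensitivities, and all sizes are made up.

    ```python
    # Minimal sketch of a maximum-likelihood readout of a visual variable
    # from single-trial spike counts under a toy Poisson encoding model.
    import numpy as np

    positions = np.linspace(-10, 10, 81)     # candidate stimulus positions (deg)
    centers = np.linspace(-10, 10, 30)       # assumed RF centers of 30 neurons
    rates = 5 * np.exp(-0.5 * ((positions[:, None] - centers) / 3.0) ** 2) + 0.5

    def decode(spikes, dt=0.05):
        """Poisson log-likelihood over candidate positions; returns the ML estimate."""
        lam = rates * dt
        loglik = (spikes * np.log(lam) - lam).sum(axis=1)
        return positions[np.argmax(loglik)]

    rng = np.random.default_rng(2)
    true_idx = 55
    spikes = rng.poisson(rates[true_idx] * 0.05)   # one simulated trial
    estimate = decode(spikes)
    ```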

     
  3. Oh, A.; Naumann, T.; Globerson, A.; Saenko, K.; Hardt, M.; Levine, S. (Eds.)
    The human visual system uses two parallel pathways for spatial processing and object recognition. In contrast, computer vision systems tend to use a single feedforward pathway, rendering them less robust, adaptive, or efficient than human vision. To bridge this gap, we developed a dual-stream vision model inspired by the human eyes and brain. At the input level, the model samples two complementary visual patterns to mimic how the human eyes use magnocellular and parvocellular retinal ganglion cells to separate retinal inputs to the brain. At the backend, the model processes the separate input patterns through two branches of convolutional neural networks (CNN) to mimic how the human brain uses the dorsal and ventral cortical pathways for parallel visual processing. The first branch (WhereCNN) samples a global view to learn spatial attention and control eye movements. The second branch (WhatCNN) samples a local view to represent the object around the fixation. Over time, the two branches interact recurrently to build a scene representation from moving fixations. We compared this model with human brains processing the same movie and evaluated their functional alignment by linear transformation. The WhereCNN and WhatCNN branches were found to differentially match the dorsal and ventral pathways of the visual cortex, respectively, primarily due to their different learning objectives, rather than their distinctions in retinal sampling or sensitivity to attention-driven eye movements. These model-based results lead us to speculate that the distinct responses and representations of the ventral and dorsal streams are more influenced by their distinct goals in visual attention and object recognition than by their specific bias or selectivity in retinal inputs. This dual-stream model takes a further step in brain-inspired computer vision, enabling parallel neural networks to actively explore and understand the visual surroundings.
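    A minimal PyTorch sketch of the dual-stream idea follows: a coarse global branch standing in for WhereCNN and a high-resolution local branch standing in for WhatCNN. The layer sizes, the crop mechanics, and the omission of the recurrent interaction are simplifying assumptions, not the authors' architecture.

    ```python
    # Hedged sketch of a dual-stream model: a global "where" branch that
    # proposes the next fixation and a local "what" branch that embeds the
    # object at the current fixation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def branch(out_dim):
        return nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, out_dim),
        )

    class DualStream(nn.Module):
        def __init__(self):
            super().__init__()
            self.where = branch(2)    # next fixation (x, y) in [-1, 1]
            self.what = branch(128)   # embedding of the fixated object

        def forward(self, frame, fixation):
            # Coarse, downsampled global view for the "where" branch
            # (magno-like input).
            global_view = F.interpolate(frame, size=64)
            next_fix = torch.tanh(self.where(global_view))
            # High-resolution local crop around the fixation for the
            # "what" branch (parvo-like input).
            grid = fixation.view(-1, 1, 1, 2) + torch.stack(
                torch.meshgrid(torch.linspace(-0.2, 0.2, 64),
                               torch.linspace(-0.2, 0.2, 64),
                               indexing="xy"), -1)
            local_view = F.grid_sample(frame, grid, align_corners=False)
            return next_fix, self.what(local_view)

    model = DualStream()
    frame = torch.rand(1, 3, 256, 256)
    fix = torch.zeros(1, 2)                    # start at the image center
    next_fix, obj_code = model(frame, fix)
    ```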
  4. Abstract

    Fixational eye movements alter the number and timing of spikes transmitted from the retina to the brain, but whether these changes enhance or degrade the retinal signal is unclear. To quantify this, we developed a Bayesian method for reconstructing natural images from the recorded spikes of hundreds of retinal ganglion cells (RGCs) in the macaque retina (male), combining a likelihood model for RGC light responses with the natural image prior implicitly embedded in an artificial neural network optimized for denoising. The method matched or surpassed the performance of previous reconstruction algorithms, and provides an interpretable framework for characterizing the retinal signal. Reconstructions were improved with artificial stimulus jitter that emulated fixational eye movements, even when the eye movement trajectory was assumed to be unknown and had to be inferred from retinal spikes. Reconstructions were degraded by small artificial perturbations of spike times, revealing more precise temporal encoding than suggested by previous studies. Finally, reconstructions were substantially degraded when derived from a model that ignored cell-to-cell interactions, indicating the importance of stimulus-evoked correlations. Thus, fixational eye movements enhance the precision of the retinal representation.
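    In spirit, such methods alternate a likelihood-driven update with a prior step supplied by a denoiser (a plug-and-play recipe). The sketch below substitutes a toy linear-exponential RGC model and a Gaussian blur for the paper's fitted likelihood and trained denoising network, so it illustrates the structure of the computation rather than the actual algorithm.

    ```python
    # Hedged sketch: gradient ascent on a Poisson log-likelihood,
    # alternated with a "denoiser" step acting as the image prior.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    rng = np.random.default_rng(3)
    n_pix, n_cells = 32 * 32, 200
    W = rng.standard_normal((n_cells, n_pix)) / n_pix   # toy linear RGC filters

    def reconstruct(spikes, n_iters=100, step=0.5):
        img = np.zeros(n_pix)
        for _ in range(n_iters):
            rate = np.exp(W @ img)                      # Poisson rates under the model
            grad = W.T @ (spikes - rate)                # d log-likelihood / d image
            img = img + step * grad
            # Prior step: pull the estimate toward "natural" (here: smooth) images.
            img = gaussian_filter(img.reshape(32, 32), sigma=0.5).ravel()
        return img.reshape(32, 32)

    true_img = gaussian_filter(rng.standard_normal((32, 32)), 2).ravel()
    spikes = rng.poisson(np.exp(W @ true_img))          # simulated RGC responses
    recon = reconstruct(spikes)
    ```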

     
  5. Abstract

    Facial motion is a primary source of social information about other humans. Prior fMRI studies have identified regions of the superior temporal sulcus (STS) that respond specifically to perceived face movements (termed fSTS), but little is known about the nature of motion representations in these regions. Here we use fMRI and multivoxel pattern analysis to characterize the representational content of the fSTS. Participants viewed a set of specific eye and mouth movements, as well as combined eye and mouth movements. Our results demonstrate that fSTS response patterns contain information about face movements, including subtle distinctions between types of eye and mouth movements. These representations generalize across the actor performing the movement, and across small differences in visual position. Critically, patterns of response to combined movements could be well predicted by linear combinations of responses to individual eye and mouth movements, pointing to a parts‐based representation of complex face movements. These results indicate that the fSTS plays an intermediate role in the process of inferring social content from visually perceived face movements, containing a representation that is sufficiently abstract to generalize across low‐level visual details, but still tied to the kinematics of face part movements.
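    The linearity claim can be illustrated in a few lines of NumPy: fit weights that combine the single-movement patterns, then ask how well the weighted sum predicts the observed combined-movement pattern. The data below are synthetic stand-ins for fSTS voxel patterns.

    ```python
    # Hedged sketch of the parts-based linearity test on multivoxel patterns.
    import numpy as np

    rng = np.random.default_rng(4)
    n_vox = 300
    eye = rng.standard_normal(n_vox)      # pattern: eye movement alone
    mouth = rng.standard_normal(n_vox)    # pattern: mouth movement alone
    combined = 0.6 * eye + 0.4 * mouth + 0.1 * rng.standard_normal(n_vox)

    # Fit weights by least squares, then correlate prediction with observation.
    X = np.column_stack([eye, mouth])
    weights, *_ = np.linalg.lstsq(X, combined, rcond=None)
    predicted = X @ weights
    r = np.corrcoef(predicted, combined)[0, 1]
    ```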

     