skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 10:00 PM ET on Friday, December 8 until 2:00 AM ET on Saturday, December 9 due to maintenance. We apologize for the inconvenience.

Title: Spontaneous perception: a framework for task-free, self-paced perception
Abstract Flipping through social media feeds, viewing exhibitions in a museum, or walking through the botanical gardens, people consistently choose to engage with and disengage from visual content. Yet, in most laboratory settings, the visual stimuli, their presentation duration, and the task at hand are all controlled by the researcher. Such settings largely overlook the spontaneous nature of human visual experience, in which perception takes place independently from specific task constraints and its time course is determined by the observer as a self-governing agent. Currently, much remains unknown about how spontaneous perceptual experiences unfold in the brain. Are all perceptual categories extracted during spontaneous perception? Does spontaneous perception inherently involve volition? Is spontaneous perception segmented into discrete episodes? How do different neural networks interact over time during spontaneous perception? These questions are imperative to understand our conscious visual experience in daily life. In this article we propose a framework for spontaneous perception. We first define spontaneous perception as a task-free and self-paced experience. We propose that spontaneous perception is guided by four organizing principles that grant it temporal and spatial structures. These principles include coarse-to-fine processing, continuity and segmentation, agency and volition, and associative processing. We provide key suggestions illustrating how these principles may interact with one another in guiding the multifaceted experience of spontaneous perception. We point to testable predictions derived from this framework, including (but not limited to) the roles of the default-mode network and slow cortical potentials in underlying spontaneous perception. We conclude by suggesting several outstanding questions for future research, extending the relevance of this framework to consciousness and spontaneous brain activity. In conclusion, the spontaneous perception framework proposed herein integrates components in human perception and cognition, which have been traditionally studied in isolation, and opens the door to understand how visual perception unfolds in its most natural context.  more » « less
Award ID(s):
1753218 1926780
Author(s) / Creator(s):
Date Published:
Journal Name:
Neuroscience of Consciousness
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In bistable perception, observers experience alternations between two interpretations of an unchanging stimulus. Neurophysiological studies of bistable perception typically partition neural measurements into stimulus-based epochs and assess neuronal differences between epochs based on subjects' perceptual reports. Computational studies replicate statistical properties of percept durations with modeling principles like competitive attractors or Bayesian inference. However, bridging neuro-behavioral findings with modeling theory requires the analysis of single-trial dynamic data. Here, we propose an algorithm for extracting nonstationary timeseries features from single-trial electrocorticography (ECoG) data. We applied the proposed algorithm to 5-min ECoG recordings from human primary auditory cortex obtained during perceptual alternations in an auditory triplet streaming task (six subjects: four male, two female). We report two ensembles of emergent neuronal features in all trial blocks. One ensemble consists of periodic functions that encode a stereotypical response to the stimulus. The other comprises more transient features and encodes dynamics associated with bistable perception at multiple time scales: minutes (within-trial alternations), seconds (duration of individual percepts), and milliseconds (switches between percepts). Within the second ensemble, we identified a slowly drifting rhythm that correlates with the perceptual states and several oscillators with phase shifts near perceptual switches. Projections of single-trial ECoG data onto these features establish low-dimensional attractor-like geometric structures invariant across subjects and stimulus types. These findings provide supporting neural evidence for computational models with oscillatory-driven attractor-based principles. The feature extraction techniques described here generalize across recording modality and are appropriate when hypothesized low-dimensional dynamics characterize an underlying neural system.

    SIGNIFICANCE STATEMENTIrrespective of the sensory modality, neurophysiological studies of multistable perception have typically investigated events time-locked to the perceptual switching rather than the time course of the perceptual states per se. Here, we propose an algorithm that extracts neuronal features of bistable auditory perception from largescale single-trial data while remaining agnostic to the subject's perceptual reports. The algorithm captures the dynamics of perception at multiple timescales, minutes (within-trial alternations), seconds (durations of individual percepts), and milliseconds (timing of switches), and distinguishes attributes of neural encoding of the stimulus from those encoding the perceptual states. Finally, our analysis identifies a set of latent variables that exhibit alternating dynamics along a low-dimensional manifold, similar to trajectories in attractor-based models for perceptual bistability.

    more » « less
  2. Cross-modal effects provide a model framework for investigating hierarchical inter-areal processing, particularly, under conditions where unimodal cortical areas receive contextual feedback from other modalities. Here, using complementary behavioral and brain imaging techniques, we investigated the functional networks participating in face and voice processing during gender perception, a high-level feature of voice and face perception. Within the framework of a signal detection decision model, Maximum likelihood conjoint measurement (MLCM) was used to estimate the contributions of the face and voice to gender comparisons between pairs of audio-visual stimuli in which the face and voice were independently modulated. Top–down contributions were varied by instructing participants to make judgments based on the gender of either the face, the voice or both modalities ( N = 12 for each task). Estimated face and voice contributions to the judgments of the stimulus pairs were not independent; both contributed to all tasks, but their respective weights varied over a 40-fold range due to top–down influences. Models that best described the modal contributions required the inclusion of two different top–down interactions: (i) an interaction that depended on gender congruence across modalities (i.e., difference between face and voice modalities for each stimulus); (ii) an interaction that depended on the within modalities’ gender magnitude. The significance of these interactions was task dependent. Specifically, gender congruence interaction was significant for the face and voice tasks while the gender magnitude interaction was significant for the face and stimulus tasks. Subsequently, we used the same stimuli and related tasks in a functional magnetic resonance imaging (fMRI) paradigm ( N = 12) to explore the neural correlates of these perceptual processes, analyzed with Dynamic Causal Modeling (DCM) and Bayesian Model Selection. Results revealed changes in effective connectivity between the unimodal Fusiform Face Area (FFA) and Temporal Voice Area (TVA) in a fashion that paralleled the face and voice behavioral interactions observed in the psychophysical data. These findings explore the role in perception of multiple unimodal parallel feedback pathways. 
    more » « less
  3. To fluidly engage with the world, our brains must simultaneously represent both the scene in front of us and our memory of the immediate surrounding environment (i.e., local visuospatial context). How does the brain's functional architecture enable sensory and mnemonic representations to closely interface while also avoiding sensory-mnemonic interference? Here, we asked this question using first-person, head-mounted virtual reality and fMRI. Using virtual reality, human participants of both sexes learned a set of immersive, real-world visuospatial environments in which we systematically manipulated the extent of visuospatial context associated with a scene image in memory across three learning conditions, spanning from a single FOV to a city street. We used individualized, within-subject fMRI to determine which brain areas support memory of the visuospatial context associated with a scene during recall (Experiment 1) and recognition (Experiment 2). Across the whole brain, activity in three patches of cortex was modulated by the amount of known visuospatial context, each located immediately anterior to one of the three scene perception areas of high-level visual cortex. Individual subject analyses revealed that these anterior patches corresponded to three functionally defined place memory areas, which selectively respond when visually recalling personally familiar places. In addition to showing activity levels that were modulated by the amount of visuospatial context, multivariate analyses showed that these anterior areas represented the identity of the specific environment being recalled. Together, these results suggest a convergence zone for scene perception and memory of the local visuospatial context at the anterior edge of high-level visual cortex.

    SIGNIFICANCE STATEMENTAs we move through the world, the visual scene around us is integrated with our memory of the wider visuospatial context. Here, we sought to understand how the functional architecture of the brain enables coexisting representations of the current visual scene and memory of the surrounding environment. Using a combination of immersive virtual reality and fMRI, we show that memory of visuospatial context outside the current FOV is represented in a distinct set of brain areas immediately anterior and adjacent to the perceptually oriented scene-selective areas of high-level visual cortex. This functional architecture would allow efficient interaction between immediately adjacent mnemonic and perceptual areas while also minimizing interference between mnemonic and perceptual representations.

    more » « less
  4. Serial dependence—an attractive perceptual bias whereby a current stimulus is perceived to be similar to previously seen ones—is thought to represent the process that facilitates the stability and continuity of visual perception. Recent results demonstrate a neural signature of serial dependence in numerosity perception, emerging very early in the time course during perceptual processing. However, whether such a perceptual signature is retained after the initial processing remains unknown. Here, we address this question by investigating the neural dynamics of serial dependence using a recently developed technique that allowed a reactivation of hidden memory states. Participants performed a numerosity discrimination task during EEG recording, with task-relevant dot array stimuli preceded by a task-irrelevant stimulus inducing serial dependence. Importantly, the neural network storing the representation of the numerosity stimulus was perturbed (or pinged) so that the hidden states of that representation can be explicitly quantified. The results first show that a neural signature of serial dependence emerges early in the brain signals, starting soon after stimulus onset. Critical to the central question, the pings at a later latency could successfully reactivate the biased representation of the initial stimulus carrying the signature of serial dependence. These results provide one of the first pieces of empirical evidence that the biased neural representation of a stimulus initially induced by serial dependence is preserved throughout a relatively long period. 
    more » « less
  5. Abstract

    Selective attention improves sensory processing of relevant information but can also impact the quality of perception. For example, attention increases visual discrimination performance and at the same time boosts apparent stimulus contrast of attended relative to unattended stimuli. Can attention also lead to perceptual distortions of visual representations? Optimal tuning accounts of attention suggest that processing is biased towards “off-tuned” features to maximize the signal-to-noise ratio in favor of the target, especially when targets and distractors are confusable. Here, we tested whether such tuning gives rise to phenomenological changes of visual features. We instructed participants to select a color among other colors in a visual search display and subsequently asked them to judge the appearance of the target color in a 2-alternative forced choice task. Participants consistently judged the target color to appear more dissimilar from the distractor color in feature space. Critically, the magnitude of these perceptual biases varied systematically with the similarity between target and distractor colors during search, indicating that attentional tuning quickly adapts to current task demands. In control experiments we rule out possible non-attentional explanations such as color contrast or memory effects. Overall, our results demonstrate that selective attention warps the representational geometry of color space, resulting in profound perceptual changes across large swaths of feature space. Broadly, these results indicate that efficient attentional selection can come at a perceptual cost by distorting our sensory experience.

    more » « less