Title: Spontaneous perception: a framework for task-free, self-paced perception
Abstract: Flipping through social media feeds, viewing exhibitions in a museum, or walking through the botanical gardens, people consistently choose to engage with and disengage from visual content. Yet, in most laboratory settings, the visual stimuli, their presentation duration, and the task at hand are all controlled by the researcher. Such settings largely overlook the spontaneous nature of human visual experience, in which perception takes place independently from specific task constraints and its time course is determined by the observer as a self-governing agent. Currently, much remains unknown about how spontaneous perceptual experiences unfold in the brain. Are all perceptual categories extracted during spontaneous perception? Does spontaneous perception inherently involve volition? Is spontaneous perception segmented into discrete episodes? How do different neural networks interact over time during spontaneous perception? These questions are imperative to understand our conscious visual experience in daily life. In this article we propose a framework for spontaneous perception. We first define spontaneous perception as a task-free and self-paced experience. We propose that spontaneous perception is guided by four organizing principles that grant it temporal and spatial structures. These principles include coarse-to-fine processing, continuity and segmentation, agency and volition, and associative processing. We provide key suggestions illustrating how these principles may interact with one another in guiding the multifaceted experience of spontaneous perception. We point to testable predictions derived from this framework, including (but not limited to) the roles of the default-mode network and slow cortical potentials in underlying spontaneous perception. We conclude by suggesting several outstanding questions for future research, extending the relevance of this framework to consciousness and spontaneous brain activity.
In conclusion, the spontaneous perception framework proposed herein integrates components of human perception and cognition that have traditionally been studied in isolation, and opens the door to understanding how visual perception unfolds in its most natural context.
Award ID(s):
1753218 1926780
Publication Date:
Journal Name:
Neuroscience of Consciousness
Sponsoring Org:
National Science Foundation
More Like this
  1. Cross-modal effects provide a model framework for investigating hierarchical inter-areal processing, particularly under conditions where unimodal cortical areas receive contextual feedback from other modalities. Here, using complementary behavioral and brain imaging techniques, we investigated the functional networks participating in face and voice processing during gender perception, a high-level feature of voice and face perception. Within the framework of a signal detection decision model, maximum likelihood conjoint measurement (MLCM) was used to estimate the contributions of the face and voice to gender comparisons between pairs of audio-visual stimuli in which the face and voice were independently modulated. Top–down contributions were varied by instructing participants to make judgments based on the gender of either the face, the voice or both modalities (N = 12 for each task). Estimated face and voice contributions to the judgments of the stimulus pairs were not independent; both contributed to all tasks, but their respective weights varied over a 40-fold range due to top–down influences. Models that best described the modal contributions required the inclusion of two different top–down interactions: (i) an interaction that depended on gender congruence across modalities (i.e., difference between face and voice modalities for each stimulus); (ii) an interaction that depended on the within modalities’ gender magnitude. The significance of these interactions was task dependent. Specifically, gender congruence interaction was significant for the face and voice tasks while the gender magnitude interaction was significant for the face and stimulus tasks. Subsequently, we used the same stimuli and related tasks in a functional magnetic resonance imaging (fMRI) paradigm (N = 12) to explore the neural correlates of these perceptual processes, analyzed with Dynamic Causal Modeling (DCM) and Bayesian Model Selection.
Results revealed changes in effective connectivity between the unimodal Fusiform Face Area (FFA) and Temporal Voice Area (TVA) in a fashion that paralleled the face and voice behavioral interactions observed in the psychophysical data. These findings explore the role in perception of multiple unimodal parallel feedback pathways.
  2. Impossible figures represent the world in ways it cannot be. From the work of M. C. Escher to any popular perception textbook, such experiences show how some principles of mental processing can be so entrenched and inflexible as to produce absurd and even incoherent outcomes that could not occur in reality. Surprisingly, however, such impossible experiences are mostly limited to visual perception; are there “impossible figures” for other sensory modalities? Here, we import a known magic trick into the laboratory to report and investigate an impossible somatosensory experience, one that can be physically felt. We show that, even under full-cue conditions with objects that can be freely inspected, subjects can be made to experience a single object alone as feeling heavier than a group of objects that includes the single object as a member: an impossible and phenomenologically striking experience of weight. Moreover, we suggest that this phenomenon, a special case of the size-weight illusion, reflects a kind of “anti-Bayesian” perceptual updating that amplifies a challenge to rational models of perception and cognition. Impossibility can not only be seen, but also felt, and in ways that matter for accounts of (ir)rational mental processing.
  3. Serial dependence (an attractive perceptual bias whereby a current stimulus is perceived to be similar to previously seen ones) is thought to represent the process that facilitates the stability and continuity of visual perception. Recent results demonstrate a neural signature of serial dependence in numerosity perception, emerging very early in the time course during perceptual processing. However, whether such a perceptual signature is retained after the initial processing remains unknown. Here, we address this question by investigating the neural dynamics of serial dependence using a recently developed technique that allowed a reactivation of hidden memory states. Participants performed a numerosity discrimination task during EEG recording, with task-relevant dot array stimuli preceded by a task-irrelevant stimulus inducing serial dependence. Importantly, the neural network storing the representation of the numerosity stimulus was perturbed (or pinged) so that the hidden states of that representation can be explicitly quantified. The results first show that a neural signature of serial dependence emerges early in the brain signals, starting soon after stimulus onset. Critical to the central question, the pings at a later latency could successfully reactivate the biased representation of the initial stimulus carrying the signature of serial dependence. These results provide one of the first pieces of empirical evidence that the biased neural representation of a stimulus initially induced by serial dependence is preserved throughout a relatively long period.
  4. Arousal levels perpetually rise and fall spontaneously. How markers of arousal (pupil size and frequency content of brain activity) relate to each other and influence behavior in humans is poorly understood. We simultaneously monitored magnetoencephalography and pupil in healthy volunteers at rest and during a visual perceptual decision-making task. Spontaneously varying pupil size correlates with power of brain activity in most frequency bands across large-scale resting state cortical networks. Pupil size recorded at prestimulus baseline correlates with subsequent shifts in detection bias (c) and sensitivity (d′). When dissociated from pupil-linked state, prestimulus spectral power of resting state networks still predicts perceptual behavior. Fast spontaneous pupil constriction and dilation correlate with large-scale brain activity as well but not perceptual behavior. Our results illuminate the relation between central and peripheral arousal markers and their respective roles in human perceptual decision-making.
  5. While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.