Abstract: It has been debated whether salient distractors in visual search can be proactively suppressed to completely prevent attentional capture, as the occurrence of proactive suppression implies that the initial shift of attention is not entirely driven by physical salience. While the presence of a Pd component in the EEG (associated with suppression) without a preceding N2pc component (associated with selection) has been used as evidence for proactive suppression, the link between these ERPs and the underlying mechanisms is not always clear. This is exemplified in two recent articles that observed the same waveform pattern, in which an early Pd-like component flipped to an N2pc-like component, but provided vastly different interpretations (Drisdelle, B. L., & Eimer, M. PD components and distractor inhibition in visual search: New evidence for the signal suppression hypothesis. Psychophysiology, 58, e13898, 2021; Kerzel, D., & Burra, N. Capture by context elements, not attentional suppression of distractors, explains the PD with small search displays. Journal of Cognitive Neuroscience, 32, 1170–1183, 2020). Using RAGNAROC (Wyble et al., Understanding visual attention with RAGNAROC: A Reflexive Attention Gradient through Neural AttRactOr Competition. Psychological Review, 127, 1163–1198, 2020), a computational model of reflexive attention, we successfully simulated this ERP pattern with minimal changes to its existing architecture, providing a parsimonious and mechanistic explanation for this flip in the EEG that is distinct from both of the previous interpretations. Our account supports the occurrence of proactive suppression and demonstrates the benefits of incorporating computational modeling into theory building.
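To make the described polarity flip concrete, the toy sketch below (not RAGNAROC itself, whose attractor dynamics are far richer) shows how two hypothetical lateralized sources of opposite polarity and different latencies, an early suppressive one and a later selective one, yield a contralateral-minus-ipsilateral difference wave that begins Pd-like and flips to N2pc-like. All waveform shapes, latencies, and amplitudes are illustrative assumptions, not the model's actual implementation.

```python
import numpy as np

# Toy illustration (NOT RAGNAROC): two hypothetical lateralized sources
# with opposite polarity and different latencies reproduce the reported
# flip in the contralateral-minus-ipsilateral difference wave.
t = np.arange(0.0, 0.5, 0.001)  # 0-500 ms post-stimulus, 1 ms resolution

def source(t, peak, width, amp):
    """Gaussian-shaped source waveform; all parameters are illustrative."""
    return amp * np.exp(-0.5 * ((t - peak) / width) ** 2)

suppression = source(t, peak=0.15, width=0.03, amp=1.0)   # early positivity (Pd-like)
selection = source(t, peak=0.27, width=0.04, amp=-1.2)    # later negativity (N2pc-like)

contra = suppression + selection       # lateralized activity at contralateral sites
ipsi = np.zeros_like(t)                # assume no lateralized activity ipsilaterally
diff_wave = contra - ipsi              # starts Pd-like, then flips N2pc-like
print(diff_wave[150], diff_wave[270])  # positive early, negative later
```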
Same-different conceptualization: a machine vision perspective
The goal of this review is to bring together material from cognitive psychology with recent machine vision studies to identify plausible neural mechanisms for visual same-different discrimination and relational understanding. We highlight how developments in the study of artificial neural networks provide computational evidence implicating attention and working memory in ascertaining visual relations, including same-different relations. We review some recent attempts to incorporate these mechanisms into flexible models of visual reasoning. Particular attention is given to recent models jointly trained on visual and linguistic information. These systems are promising, but they still fall short of the biological standard in several ways, which we outline in a final section.
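As a concrete reference point for the same-different (SD) task the review discusses, the sketch below shows a common machine-vision baseline: a siamese encoder whose two item embeddings are compared for a same/different readout. It is a generic illustration, not a model from the reviewed literature; the architecture and sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Generic siamese set-up for a same-different (SD) task: two items are
# encoded by a shared CNN and their embeddings compared. Illustrative only.
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, 32),
        )

    def forward(self, x):
        return self.net(x)

class SameDifferent(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = Encoder()
        self.head = nn.Linear(32, 1)

    def forward(self, a, b):
        # |e_a - e_b| supports a "same vs. different" readout after training.
        return self.head(torch.abs(self.enc(a) - self.enc(b)))

model = SameDifferent()
a, b = torch.rand(8, 1, 16, 16), torch.rand(8, 1, 16, 16)
logits = model(a, b)  # one same/different logit per pair
```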
- PAR ID: 10205786
- Date Published:
- Journal Name: Current Opinion in Behavioral Sciences
- Volume: 37
- ISSN: 2352-1546
- Page Range / eLocation ID: 47–55
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The development of deep convolutional neural networks (CNNs) has recently led to great successes in computer vision, and CNNs have become de facto computational models of vision. However, a growing body of work suggests that they exhibit critical limitations beyond image categorization. Here, we study one such fundamental limitation: judging whether two simultaneously presented items are the same or different (SD), compared to a baseline assessment of their spatial relationship (SR). In both human subjects and artificial neural networks, we test the prediction that SD tasks recruit additional cortical mechanisms which underlie critical aspects of visual cognition that are not explained by current computational models. We thus recorded EEG signals from human participants engaged in the same tasks as the computational models. Importantly, in humans the two tasks were matched in terms of difficulty by an adaptive psychometric procedure; yet, on top of a modulation of evoked potentials, our results revealed higher activity in the low beta (16–24 Hz) band in the SD compared to the SR conditions. We surmise that these oscillations reflect the crucial involvement of additional mechanisms, such as working memory and attention, which are missing in current feed-forward CNNs.
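For readers who want to see how such a band-specific effect is typically quantified, here is a minimal sketch of comparing low-beta (16–24 Hz) power between conditions using Welch's method; the arrays, sampling rate, and trial counts are placeholders, not the study's data or pipeline.

```python
import numpy as np
from scipy.signal import welch

# Placeholder data: (n_trials, n_samples) arrays per condition at rate fs.
fs = 250
rng = np.random.default_rng(0)
eeg_sd = rng.standard_normal((100, 2 * fs))  # stand-in for SD trials
eeg_sr = rng.standard_normal((100, 2 * fs))  # stand-in for SR trials

def low_beta_power(trials, fs, band=(16.0, 24.0)):
    freqs, psd = welch(trials, fs=fs, nperseg=fs)  # PSD per trial (last axis = time)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[:, mask].mean(axis=1)  # mean low-beta power per trial

print(low_beta_power(eeg_sd, fs).mean(), low_beta_power(eeg_sr, fs).mean())
```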
-
Recent advances in Convolutional Neural Network (CNN) model interpretability have led to impressive progress in visualizing and understanding model predictions. In particular, gradient-based visual attention methods have driven much recent effort in using visual attention maps as a means for visual explanations. A key problem, however, is that these methods are designed for classification and categorization tasks, and their extension to explaining generative models, e.g., variational autoencoders (VAEs), is not trivial. In this work, we take a step towards bridging this crucial gap, proposing the first technique to visually explain VAEs by means of gradient-based attention. We present methods to generate visual attention from the learned latent space, and also demonstrate that such attention explanations serve more than just explaining VAE predictions. We show how these attention maps can be used to localize anomalies in images, demonstrating state-of-the-art performance on the MVTec-AD dataset. We also show how they can be infused into model training, helping bootstrap the VAE into learning improved latent space disentanglement, demonstrated on the Dsprites dataset.
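The sketch below illustrates the general idea of gradient-based attention driven by a VAE's latent space, in the spirit of Grad-CAM: backpropagate a score derived from the latent code to an intermediate encoder feature map and weight that map by its channel-wise gradients. It is an illustrative reconstruction under stated assumptions, not the authors' implementation; the encoder architecture and the latent score are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical VAE encoder; only the pieces needed for the attention sketch.
class TinyVAEEncoder(nn.Module):
    def __init__(self, z_dim=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.fc_mu = nn.Linear(16 * 8 * 8, z_dim)

    def forward(self, x):
        f = self.features(x)           # (B, 16, 8, 8) for 32x32 input
        mu = self.fc_mu(f.flatten(1))  # latent means
        return f, mu

enc = TinyVAEEncoder()
x = torch.rand(1, 1, 32, 32)
f, mu = enc(x)
f.retain_grad()              # keep gradients at the intermediate feature map
mu.pow(2).sum().backward()   # illustrative score derived from the latent code

weights = f.grad.mean(dim=(2, 3), keepdim=True)  # channel importances
attn = F.relu((weights * f).sum(dim=1))          # (B, 8, 8) attention map
attn = F.interpolate(attn.unsqueeze(1), size=x.shape[-2:], mode="bilinear")
```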
-
Abstract: Speech processing often occurs amid competing inputs from other modalities, for example, listening to the radio while driving. We examined the extent to which dividing attention between auditory and visual modalities (bimodal divided attention) impacts neural processing of natural continuous speech from acoustic to linguistic levels of representation. We recorded electroencephalographic (EEG) responses when human participants performed a challenging primary visual task, imposing low or high cognitive load, while listening to audiobook stories as a secondary task. The two dual-task conditions were contrasted with an auditory single-task condition in which participants attended to stories while ignoring visual stimuli. Behaviorally, the high load dual-task condition was associated with lower speech comprehension accuracy relative to the other two conditions. We fitted multivariate temporal response function encoding models to predict EEG responses from acoustic and linguistic speech features at different representation levels, including auditory spectrograms and information-theoretic models of sublexical-, word-form-, and sentence-level representations. Neural tracking of most acoustic and linguistic features remained unchanged with increasing dual-task load, despite unambiguous behavioral and neural evidence of the high load dual-task condition being more demanding. Compared to the auditory single-task condition, dual-task conditions selectively reduced neural tracking of only some acoustic and linguistic features, mainly at latencies >200 ms, while earlier latencies were surprisingly unaffected. These findings indicate that behavioral effects of bimodal divided attention on continuous speech processing occur not because of impaired early sensory representations but likely at later cognitive processing stages. Crossmodal attention-related mechanisms may not be uniform across different speech processing levels.
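As a minimal sketch of the encoding-model approach described here, the code below fits a single-feature temporal response function (TRF) by ridge regression over time-lagged stimulus features; the data, lag range, and regularization are placeholder choices, not the study's settings.

```python
import numpy as np

# Minimal TRF encoding model: ridge regression from time-lagged stimulus
# features to one EEG channel. All data here are random placeholders.
rng = np.random.default_rng(1)
fs = 64
n = 60 * fs                    # 60 s of data
stim = rng.standard_normal(n)  # e.g., an acoustic envelope feature
eeg = rng.standard_normal(n)   # one EEG channel

lags = np.arange(0, int(0.4 * fs))  # 0-400 ms causal lags
X = np.stack([np.roll(stim, lag) for lag in lags], axis=1)
X[: lags.max()] = 0                 # zero out samples wrapped by np.roll

lam = 1e2                           # illustrative ridge penalty
trf = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
pred = X @ trf                      # predicted EEG
r = np.corrcoef(pred, eeg)[0, 1]    # neural tracking as prediction accuracy
```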