Title: General object-based features account for letter perception
After years of experience, humans become experts at perceiving letters. Is this visual capacity attained by learning specialized letter features, or by reusing general visual features previously learned in service of object categorization? To explore this question, we first measured the perceptual similarity of letters in two behavioral tasks, visual search and letter categorization. Then, we trained deep convolutional neural networks on either 26-way letter categorization or 1000-way object categorization, as a way to operationalize possible specialized letter features and general object-based features, respectively. We found that the general object-based features more robustly correlated with the perceptual similarity of letters. We then operationalized additional forms of experience-dependent letter specialization by altering object-trained networks with varied forms of letter training; however, none of these forms of letter specialization improved the match to human behavior. Thus, our findings reveal that it is not necessary to appeal to specialized letter representations to account for perceptual similarity of letters. Instead, we argue that it is more likely that the perception of letters depends on domain-general visual features.
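As a rough illustration of the comparison the abstract describes, the sketch below extracts features for letter images from an ImageNet-trained convolutional network and correlates their pairwise similarity with a behavioral similarity matrix. The network choice (VGG-16), the `letter_images` tensor, and the `behavioral_sim` matrix are placeholders for illustration, not the authors' stimuli or pipeline.

```python
# Minimal sketch (not the authors' pipeline): correlate pairwise letter
# similarity in an ImageNet-trained CNN's feature space with a behavioral
# letter-similarity matrix.
import numpy as np
import torch
import torchvision.models as models
from scipy.stats import spearmanr

letter_images = torch.rand(26, 3, 224, 224)   # placeholder: one image per letter
behavioral_sim = np.random.rand(26, 26)       # placeholder: behavioral similarity

# "General object-based features": convolutional layers of an object-trained net
model = models.vgg16(weights="IMAGENET1K_V1").eval()
with torch.no_grad():
    acts = model.features(letter_images)           # (26, C, H, W) activations
    acts = acts.flatten(start_dim=1).numpy()       # one feature vector per letter

# Pairwise similarity between letters in model feature space
model_sim = np.corrcoef(acts)                      # (26, 26) correlation matrix

# Compare model and behavioral similarity over the off-diagonal entries
iu = np.triu_indices(26, k=1)
rho, _ = spearmanr(model_sim[iu], behavioral_sim[iu])
print(f"model-behavior correlation: rho = {rho:.3f}")
```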
Award ID(s):
1942438
PAR ID:
10421731
Author(s) / Creator(s):
; ; ;
Editor(s):
Isik, Leyla
Date Published:
Journal Name:
PLOS Computational Biology
Volume:
18
Issue:
9
ISSN:
1553-7358
Page Range / eLocation ID:
e1010522
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Letter position coding in word recognition has been widely investigated in the visual modality (e.g., labotarory is confusable with laboratory), but not as much in the tactile modality using braille, leading to an incomplete understanding of whether this process is modality-dependent. Unlike sighted readers, braille readers do not show a transposed-letter similarity effect with nonadjacent transpositions (e.g., labotarory = labodanory; Perea et al., 2012). While this latter finding was taken to suggest that the flexibility in letter position coding was due to visual factors (e.g., perceptual uncertainty in the location of visual objects (letters)), it is necessary to test whether transposed-letter effects occur with adjacent letters to reach firm conclusions. Indeed, in the auditory modality (i.e., another serial modality), a transposed-phoneme effect occurs for adjacent but not for nonadjacent transpositions. In a lexical decision task, we examined whether pseudowords created by transposing two adjacent letters of a word (e.g., laboartory) are more confusable with their base word (laboratory) than pseudowords created by replacing those letters (laboestory) in braille. Results showed that transposed-letter pseudowords produced more errors and slower responses than the orthographic controls. Thus, these findings suggest that the mechanism of serial order, while universal, can be shaped by the sensory modality at play. 
  2. Abstract

    Research at the intersection of computer vision and neuroscience has revealed a hierarchical correspondence between the layers of deep convolutional neural networks (DCNNs) and the cascade of regions along human ventral visual cortex. Recently, studies have uncovered the emergence of human-interpretable concepts within the layers of DCNNs trained to identify visual objects and scenes. Here, we asked whether an artificial neural network (with convolutional structure) trained for visual categorization would demonstrate spatial correspondences with human brain regions showing central/peripheral biases. Using representational similarity analysis, we compared activations of convolutional layers of a DCNN trained for object and scene categorization with neural representations in human brain visual regions. Results reveal a brain-like topographical organization in the layers of the DCNN, such that activations of layer units with a central bias were associated with brain regions with foveal tendencies (e.g. fusiform gyrus), and activations of layer units with selectivity for image backgrounds were associated with cortical regions showing a peripheral preference (e.g. parahippocampal cortex). The emergence of a categorical topographical correspondence between DCNNs and brain regions suggests these models are a good approximation of the perceptual representation generated by biological neural networks.
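A minimal representational similarity analysis (RSA) sketch, using random placeholder data rather than the study's DCNN activations or fMRI patterns: it builds a dissimilarity structure for one model layer and one brain region and takes their rank correlation as the layer-to-region RSA score.

```python
# RSA sketch under assumed placeholder inputs: `layer_acts` holds one
# DCNN-layer activation vector per stimulus, `roi_patterns` holds the
# corresponding voxel patterns from one brain region.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

n_stimuli = 60
layer_acts = np.random.rand(n_stimuli, 4096)    # placeholder DCNN activations
roi_patterns = np.random.rand(n_stimuli, 500)   # placeholder fMRI voxel patterns

# Representational dissimilarity matrices as condensed vectors (1 - correlation)
model_rdm = pdist(layer_acts, metric="correlation")
brain_rdm = pdist(roi_patterns, metric="correlation")

# Rank correlation between model and brain RDMs is the RSA score
rho, _ = spearmanr(model_rdm, brain_rdm)
print(f"layer-to-region RSA: rho = {rho:.3f}")
```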

     
  3. Abstract

    The grouping of sensory stimuli into categories is fundamental to cognition. Previous research in the visual and auditory systems supports a two‐stage processing hierarchy that underlies perceptual categorization: (a) a “bottom‐up” perceptual stage in sensory cortices where neurons show selectivity for stimulus features and (b) a “top‐down” second stage in higher level cortical areas that categorizes the stimulus‐selective input from the first stage. In order to test the hypothesis that the two‐stage model applies to the somatosensory system, 14 human participants were trained to categorize vibrotactile stimuli presented to their right forearm. Then, during an fMRI scan, participants actively categorized the stimuli. Representational similarity analysis revealed stimulus selectivity in areas including the left precentral and postcentral gyri, the supramarginal gyrus, and the posterior middle temporal gyrus. Crucially, we identified a single category‐selective region in the left ventral precentral gyrus. Furthermore, an estimation of directed functional connectivity delivered evidence for robust top‐down connectivity from the second to first stage. These results support the validity of the two‐stage model of perceptual categorization for the somatosensory system, suggesting common computational principles and a unified theory of perceptual categorization across the visual, auditory, and somatosensory systems.

     
  4. Abstract Introduction

    How do multiple sources of information interact to form mental representations of object categories? It is commonly held that object categories reflect the integration of perceptual features and semantic/knowledge-based features. To explore the relative contributions of these two sources of information, we used functional magnetic resonance imaging (fMRI) to identify regions involved in the representation of object categories with shared visual and/or semantic features.

    Methods

    While in the MRI scanner, participants (N = 20) viewed a series of objects that varied in their degree of visual and semantic overlap. We used a blocked adaptation design to identify sensitivity to visual and semantic features both in a priori visual processing regions and, via an exploratory whole-brain analysis, in a distributed network of object processing regions.

    Results

    Somewhat surprisingly, within higher-order visual processing regions, specifically lateral occipital cortex (LOC), we did not obtain any difference in neural adaptation for shared visual versus semantic category membership. More broadly, both visual and semantic information affected a distributed network of independently identified category-selective regions. Adaptation was seen in a whole-brain network of processing regions in response to both visual similarity and semantic similarity; specifically, the angular gyrus (AnG) adapted to visual similarity, and the dorsomedial prefrontal cortex (DMPFC) adapted to both visual and semantic similarity.

    Conclusions

    Our findings suggest that perceptual features help organize mental categories throughout the object processing hierarchy. Most notably, visual similarity also influenced adaptation in nonvisual brain regions (i.e., AnG and DMPFC). We conclude that category‐relevant visual features are maintained in higher‐order conceptual representations and visual information plays an important role in both the acquisition and neural representation of conceptual object categories.
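A toy sketch of the adaptation logic behind the blocked design described above (not the study's analysis): with hypothetical block-averaged responses, a region's adaptation to a shared feature appears as a reduced response to shared-feature blocks relative to blocks of unrelated objects. All values are random placeholders.

```python
# Hypothetical adaptation-index sketch for a blocked fMRI design; the block
# means below are simulated, not real data.
import numpy as np

rng = np.random.default_rng(0)
unrelated = rng.normal(1.0, 0.1, size=20)         # mean response per unrelated block
shared_visual = rng.normal(0.8, 0.1, size=20)     # blocks sharing visual features
shared_semantic = rng.normal(0.9, 0.1, size=20)   # blocks sharing semantic features

# Positive adaptation index = suppression for shared-feature blocks
visual_adaptation = unrelated.mean() - shared_visual.mean()
semantic_adaptation = unrelated.mean() - shared_semantic.mean()
print(f"visual adaptation: {visual_adaptation:.3f}, "
      f"semantic adaptation: {semantic_adaptation:.3f}")
```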

     
  5. While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions that fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
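In the spirit of the deep perceptual metrics discussed in this abstract, the sketch below computes a simple deep-feature distance from unit-normalized VGG-16 activations at a handful of layers. The layer selection and equal weighting are simplifying assumptions for illustration, not the paper's learned metric.

```python
# Illustrative deep-feature "perceptual distance": sum of mean squared
# differences between channel-normalized VGG-16 activations at a few layers.
# Image tensors here are random placeholders.
import torch
import torchvision.models as models

vgg = models.vgg16(weights="IMAGENET1K_V1").features.eval()
layer_ids = {3, 8, 15, 22, 29}     # ReLU outputs of the five conv blocks

def deep_perceptual_distance(x, y):
    """Accumulate feature differences as the two images pass through VGG."""
    dist = 0.0
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            x, y = layer(x), layer(y)
            if i in layer_ids:
                # Unit-normalize each spatial feature vector along channels
                xn = x / (x.norm(dim=1, keepdim=True) + 1e-8)
                yn = y / (y.norm(dim=1, keepdim=True) + 1e-8)
                dist += ((xn - yn) ** 2).mean().item()
    return dist

img_a = torch.rand(1, 3, 224, 224)  # placeholder images
img_b = torch.rand(1, 3, 224, 224)
print(f"deep perceptual distance: {deep_perceptual_distance(img_a, img_b):.4f}")
```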