
Title: Perceptual learning in correlation estimation: The role of learning category organization
Research has shown that estimation of correlation from scatter plots is done poorly by both novices and experts. We tested whether proficiency in correlation estimation could be improved by perceptual learning interventions, in the form of perceptual-adaptive learning modules (PALMs). We also tested learning effects of alternative category structures in perceptual learning. We organized the same set of 252 scatter plot displays either into a PALM that implemented spacing in learning by shape categories or one in which the categories were ranges of correlation strength. Both PALMs produced markedly reduced errors, and both led trained participants to classify near transfer items as accurately as trained items. Differences in category organization produced modest effects on learning; there was some indication of more consistent reduction of absolute error when learning categories were organized by shape, whereas average bias of judgments was best reduced by categories organized by different numerical ranges of correlation.
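Stimuli of the kind described above, scatter plots whose population correlation is fixed at a target value, are commonly generated by mixing two independent Gaussian variables. The following is a minimal sketch of that standard construction (not the authors' actual stimulus-generation code; the function name `scatter_sample` is illustrative), assuming NumPy:

```python
import numpy as np

def scatter_sample(r, n=50, seed=None):
    """Draw n (x, y) points whose population correlation is r.

    Uses the standard mixing construction: y = r*x + sqrt(1 - r^2)*e,
    where x and e are independent standard normal variables, so that
    Corr(x, y) = r in expectation.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    e = rng.standard_normal(n)
    y = r * x + np.sqrt(1.0 - r**2) * e
    return x, y

# The empirical correlation of a sample fluctuates around the target r,
# approaching it as n grows.
x, y = scatter_sample(0.7, n=500, seed=0)
r_hat = float(np.corrcoef(x, y)[0, 1])
```

A set of displays spanning "ranges of correlation strength," as in the second PALM condition, would then just call this with different values of `r`.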
Author(s) / Creator(s): Rogers, T. T.; Rau, M.; Zhu, X.; Kalish, C. W.
Journal Name: Proceedings of the 40th Annual Conference of the Cognitive Science Society
Sponsoring Org: National Science Foundation
More Like this
  1. Category learning and visual perception are fundamentally interactive processes, such that successful categorization often depends on the ability to make fine visual discriminations between stimuli that vary on continuously valued dimensions. Research suggests that category learning can improve perceptual discrimination along the stimulus dimensions that predict category membership and that these perceptual enhancements are a byproduct of functional plasticity in the visual system. However, the precise mechanisms underlying learning-dependent sensory modulation in categorization are not well understood. We hypothesized that category learning leads to a representational sharpening of underlying sensory populations tuned to values at or near the category boundary. Furthermore, such sharpening should occur largely during active learning of new categories. These hypotheses were tested using fMRI and a theoretically constrained model of vision to quantify changes in the shape of orientation representations while human adult subjects learned to categorize physically identical stimuli based on either an orientation rule (N = 12) or an orthogonal spatial frequency rule (N = 13). Consistent with our predictions, modeling results revealed relatively enhanced reconstructed representations of stimulus orientation in visual cortex (V1–V3) only for orientation rule learners. Moreover, these reconstructed representations varied as a function of distance from the category boundary, such that representations for challenging stimuli near the boundary were significantly sharper than those for stimuli at the category centers. These results support an efficient model of plasticity wherein only the sensory populations tuned to the most behaviorally relevant regions of feature space are enhanced during category learning.

  2.
    Abstract Perception, representation, and memory of ensemble statistics have attracted growing interest. Studies found that, at different abstraction levels, the brain represents similar items as unified percepts. We found that global ensemble perception is automatic and unconscious, affecting later perceptual judgments regarding individual member items. Implicit effects of set mean and range for low-level feature ensembles (size, orientation, brightness) were replicated for high-level category objects. This similarity suggests that analogous mechanisms underlie these extreme levels of abstraction. Here, we bridge the span between visual features and semantic object categories using the identical implicit perception experimental paradigm for intermediate novel visual-shape categories, constructing ensemble exemplars by introducing systematic variations of a central category base or ancestor. In five experiments, with different item variability, we test automatic representation of ensemble category characteristics and its effect on a subsequent memory task. Results show that observer representation of ensembles includes the group’s central shape, category ancestor (progenitor), or group mean. Observers also easily reject memory of shapes belonging to different categories, i.e., originating from different ancestors. We conclude that complex categories, like simple visual form ensembles, are represented in terms of statistics including a central object, as well as category boundaries. We refer to the model proposed by Benna and Fusi (bioRxiv 624239, 2019) that memory representation is compressed when related elements are represented by identifying their ancestor and each one’s difference from it. We suggest that ensemble mean perception, like category prototype extraction, might reflect employment at different representation levels of an essential, general representation mechanism.
  3. Abstract

    Learning and recognition can be improved by sorting novel items into categories and subcategories. Such hierarchical categorization is easy when it can be performed according to learned rules (e.g., “if car, then automatic or stick shift” or “if boat, then motor or sail”). Here, we present results showing that human participants acquire categorization rules for new visual hierarchies rapidly, and that, as they do, corresponding hierarchical representations of the categorized stimuli emerge in patterns of neural activation in the dorsal striatum and in posterior frontal and parietal cortex. Participants learned to categorize novel visual objects into a hierarchy with superordinate and subordinate levels based on the objects' shape features, without having been told the categorization rules for doing so. On each trial, participants were asked to report the category and subcategory of the object, after which they received feedback about the correctness of their categorization responses. Participants trained over the course of a one‐hour‐long session while their brain activation was measured using functional magnetic resonance imaging. Over the course of training, significant hierarchy learning took place as participants discovered the nested categorization rules, as evidenced by the occurrence of a learning trial, after which performance suddenly increased. This learning was associated with increased representational strength of the newly acquired hierarchical rules in a corticostriatal network including the posterior frontal and parietal cortex and the dorsal striatum. We also found evidence suggesting that reinforcement learning in the dorsal striatum contributed to hierarchical rule learning.

  4. Abstract

    The grouping of sensory stimuli into categories is fundamental to cognition. Previous research in the visual and auditory systems supports a two‐stage processing hierarchy that underlies perceptual categorization: (a) a “bottom‐up” perceptual stage in sensory cortices where neurons show selectivity for stimulus features and (b) a “top‐down” second stage in higher level cortical areas that categorizes the stimulus‐selective input from the first stage. In order to test the hypothesis that the two‐stage model applies to the somatosensory system, 14 human participants were trained to categorize vibrotactile stimuli presented to their right forearm. Then, during an fMRI scan, participants actively categorized the stimuli. Representational similarity analysis revealed stimulus selectivity in areas including the left precentral and postcentral gyri, the supramarginal gyrus, and the posterior middle temporal gyrus. Crucially, we identified a single category‐selective region in the left ventral precentral gyrus. Furthermore, an estimation of directed functional connectivity delivered evidence for robust top‐down connectivity from the second to first stage. These results support the validity of the two‐stage model of perceptual categorization for the somatosensory system, suggesting common computational principles and a unified theory of perceptual categorization across the visual, auditory, and somatosensory systems.

  5. The PoseASL dataset consists of color and depth videos collected from ASL signers at the Linguistic and Assistive Technologies Laboratory under the direction of Matt Huenerfauth, as part of a collaborative research project with researchers at the Rochester Institute of Technology, Boston University, and the University of Pennsylvania. Access: After becoming an authorized user of Databrary, please contact Matt Huenerfauth if you have difficulty accessing this volume. We have collected a new dataset consisting of color and depth videos of fluent American Sign Language signers performing sequences of ASL signs and sentences. Given interest among sign-recognition and other computer-vision researchers in red-green-blue-depth (RGBD) video, we release this dataset for use by the research community. In addition to the video files, we share depth data files from a Kinect v2 sensor, as well as additional motion-tracking files produced through post-processing of this data. Organization of the Dataset: The dataset is organized into sub-folders, with codenames such as "P01" or "P16" etc. These codenames refer to specific human signers who were recorded in this dataset. Please note that there was no participant P11 nor P14; those numbers were accidentally skipped during the process of making appointments to collect video stimuli. Task: During the recording session, the participant was met by a member of our research team who was a native ASL signer. No other individuals were present during the data collection session. After signing the informed consent and video release document, participants responded to a demographic questionnaire. Next, the data-collection session consisted of English word stimuli and cartoon videos. The recording session began with showing participants stimuli consisting of slides that displayed English words and photos of items, and participants were asked to produce the sign for each (PDF included in materials subfolder).
Next, participants viewed three videos of short animated cartoons, which they were asked to recount in ASL: - Canary Row, Warner Brothers Merrie Melodies 1950 (the 7-minute video divided into seven parts) - Mr. Koumal Flies Like a Bird, Studio Animovaneho Filmu 1969 - Mr. Koumal Battles his Conscience, Studio Animovaneho Filmu 1971 The word list and cartoons were selected as they are identical to the stimuli used in the collection of the Nicaraguan Sign Language video corpora - see: Senghas, A. (1995). Children’s Contribution to the Birth of Nicaraguan Sign Language. Doctoral dissertation, Department of Brain and Cognitive Sciences, MIT. Demographics: All 14 of our participants were fluent ASL signers. As screening, we asked our participants: Did you use ASL at home growing up, or did you attend a school as a very young child where you used ASL? All the participants responded affirmatively to this question. A total of 14 DHH participants were recruited on the Rochester Institute of Technology campus. Participants included 7 men and 7 women, aged 21 to 35 (median = 23.5). All of our participants reported that they began using ASL when they were 5 years old or younger, with 8 reporting ASL use since birth, and 3 others reporting ASL use since age 18 months. Filetypes: *.avi, *_dep.bin: The PoseASL dataset was captured using a Kinect v2 RGBD camera. The output of this camera system includes multiple channels: RGB, depth, skeleton joints (25 joints for every video frame), and HD face (1,347 points). The video resolution is 1920 x 1080 pixels for the RGB channel and 512 x 424 pixels for the depth channel. Due to limitations on acceptable filetypes for sharing on Databrary, we were not permitted to share the binary *_dep.bin files produced directly by the Kinect v2 camera system on the Databrary platform. If your research requires the original binary *_dep.bin files, then please contact Matt Huenerfauth.
*_face.txt, *_HDface.txt, *_skl.txt: To make it easier for future researchers to make use of this dataset, we have also performed some post-processing of the Kinect data. To extract the skeleton coordinates from the RGB videos, we used the Openpose system, which is capable of detecting body, hand, facial, and foot keypoints of multiple people in single images in real time. The output of Openpose includes estimation of 70 keypoints for the face, including eyes, eyebrows, nose, mouth, and face contour. The software also estimates 21 keypoints for each of the hands (Simon et al., 2017), including 3 keypoints for each finger, as shown in Figure 2. Additionally, there are 25 keypoints estimated for the body pose (and feet) (Cao et al., 2017; Wei et al., 2016). Reporting Bugs or Errors: Please contact Matt Huenerfauth to report any bugs or errors that you identify in the corpus. We appreciate your help in improving the quality of the corpus over time by identifying any errors. Acknowledgement: This material is based upon work supported by the National Science Foundation under award 1749376: "Collaborative Research: Multimethod Investigation of Articulatory and Perceptual Constraints on Natural Language Evolution."
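Openpose-style keypoint output is conventionally serialized as a flat sequence of (x, y, confidence) triplets per frame. The following is a minimal sketch of unpacking such a sequence into per-point rows; note that the flat-triplet layout of the shared *_skl.txt / *_face.txt files is an assumption here (the function name `unpack_keypoints` is illustrative), so inspect the actual files before relying on it:

```python
import numpy as np

def unpack_keypoints(flat, n_points):
    """Reshape a flat [x0, y0, c0, x1, y1, c1, ...] keypoint sequence
    into an (n_points, 3) array whose rows are (x, y, confidence)."""
    arr = np.asarray(flat, dtype=float)
    if arr.size != n_points * 3:
        raise ValueError(f"expected {n_points * 3} values, got {arr.size}")
    return arr.reshape(n_points, 3)

# Example with placeholder values (not real data): 25 body keypoints,
# matching the body-pose count reported above; face would use 70,
# each hand 21.
frame = np.arange(25 * 3)
body = unpack_keypoints(frame, 25)
```

The per-point confidence column is what makes it possible to filter out keypoints Openpose failed to detect (typically reported as zero confidence) before downstream analysis.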