Title: A reinforcement learning approach to speech category acquisition
Adults struggle to learn non-native speech categories in many experimental settings (Goto, 1971), but learn efficiently in a video game paradigm where non-native speech sounds have functional significance (Lim and Holt, 2011). Behavioral and neural evidence from this and other paradigms points toward the involvement of reinforcement learning mechanisms in speech category learning. We formalize this hypothesis computationally and present two simulations. The first simulates the findings of Lim et al. (2019), providing a proof of principle that a reinforcement learning algorithm can successfully capture human results in a video game where people are learning novel categories of noise tokens. Our second simulation extends this to speech sounds and demonstrates that our algorithm mimics second language learners’ improvement on discrimination of a non-native speech contrast. Together these two simulations show that reinforcement learning provides an accurate model of human learning in this paradigm and provide evidence supporting the hypothesis that this mechanism could play a key role in effective speech category learning in adults. Being able to identify the algorithms employed in this paradigm could provide many avenues for pedagogical changes in second language learning and let teachers harness the processes that allow for efficient learning and improvement of non-native perceptual ability.
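The abstract does not specify the algorithm, so the following is only a minimal sketch of the general idea of reward-driven category learning, not the authors' model. It assumes two hypothetical categories defined as Gaussians over a single acoustic cue, a softmax choice rule, and a delta-rule (reward prediction error) update on linear action values. The function name `simulate` and all parameter values are illustrative assumptions.

```python
import math
import random


def simulate(n_trials=2000, alpha=0.05, temperature=0.1, seed=0):
    """Toy reward-driven category learner (illustrative assumption only)."""
    rng = random.Random(seed)
    means = (-1.0, 1.0)  # hypothetical category means on a 1-D cue
    sd = 0.5             # hypothetical within-category spread
    # Linear action values: q[a] = w[a][0] * cue + w[a][1]
    w = [[0.0, 0.0], [0.0, 0.0]]
    correct = []
    for _ in range(n_trials):
        category = rng.randrange(2)
        cue = rng.gauss(means[category], sd)
        q = [w[a][0] * cue + w[a][1] for a in range(2)]
        # Softmax over two actions reduces to a logistic function.
        p_action1 = 1.0 / (1.0 + math.exp(-(q[1] - q[0]) / temperature))
        action = 1 if rng.random() < p_action1 else 0
        # Reward is given only when the response matches the category,
        # standing in for a successful outcome in the game.
        reward = 1.0 if action == category else 0.0
        # Delta rule: move the chosen action's value toward the reward.
        delta = reward - q[action]
        w[action][0] += alpha * delta * cue
        w[action][1] += alpha * delta
        correct.append(reward)
    return correct, w


if __name__ == "__main__":
    correct, w = simulate()
    early = sum(correct[:100]) / 100
    late = sum(correct[-500:]) / 500
    print(f"accuracy, first 100 trials: {early:.2f}")
    print(f"accuracy, last 500 trials:  {late:.2f}")
```

In this sketch the learner never receives category labels. It only observes whether its own response was rewarded, which is the sense in which the category boundary is acquired through reinforcement rather than explicit supervision.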
Award ID(s):
2120834
PAR ID:
10314285
Journal Name:
Proceedings of the Annual Boston University Conference on Language Development
ISSN:
1080-692X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The extent to which articulatory information embedded in incoming speech contributes to the formation of new perceptual categories for speech sounds has been a matter of discourse for decades. It has been theorized that the acquisition of new speech sound categories requires a network of sensory and speech motor cortical areas (the “dorsal stream”) to successfully integrate auditory and articulatory information. However, it is possible that these brain regions are not sensitive specifically to articulatory information, but instead are sensitive to the abstract phonological categories being learned. We tested this hypothesis by training participants over the course of several days on an articulable non-native speech contrast and acoustically matched inarticulable nonspeech analogues. After reaching comparable levels of proficiency with the two sets of stimuli, activation was measured in fMRI as participants passively listened to both sound types. Decoding of category membership for the articulable speech contrast alone revealed a series of left and right hemisphere regions outside of the dorsal stream that have previously been implicated in the emergence of non-native speech sound categories, while no regions could successfully decode the inarticulable nonspeech contrast. Although activation patterns in the left inferior frontal gyrus (IFG), the middle temporal gyrus (MTG), and the supplementary motor area (SMA) provided better information for decoding articulable (speech) sounds compared to the inarticulable (sine wave) sounds, the finding that dorsal stream regions do not emerge as good decoders of the articulable contrast alone suggests that other factors, including the strength and structure of the emerging speech categories, are more likely drivers of dorsal stream activation for novel sound learning.
  2. Early changes in infants’ ability to perceive native and nonnative speech sound contrasts are typically attributed to their developing knowledge of phonetic categories. We critically examine this hypothesis and argue that there is little direct evidence of category knowledge in infancy. We then propose an alternative account in which infants’ perception changes because they are learning a perceptual space that is appropriate to represent speech, without yet carving up that space into phonetic categories. If correct, this new account has substantial implications for understanding early language development.
  3. Souza, Alessandra S (Ed.)
    What is the role of working memory over the course of non-native speech category learning? Prior work has predominantly focused on how working memory might influence learning assessed at a single timepoint. Here, we substantially extend this prior work by examining the role of working memory on speech learning performance over time (i.e., over several months) and leverage a multifaceted approach that provides key insights into how working memory influences learning accuracy, maintenance of knowledge over time, generalization ability, and decision processes. We found that the role of working memory in non-native speech learning depends on the timepoint of learning and whether individuals learned the categories at all. Among learners, across all stages of learning, working memory was associated with higher accuracy as well as faster and slightly more cautious decision making. Further, while learners and non-learners did not have substantially different working memory performance, learners had faster evidence accumulation and more cautious decision thresholds throughout all sessions. Working memory may enhance learning by facilitating rapid category acquisition in initial stages and enabling faster and slightly more careful decision-making strategies that may reduce the overall effort needed to learn. Our results have important implications for developing interventions to improve learning in naturalistic language contexts. 
  4. Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than nonnative ones. For example, between 6 to 8 mo and 10 to 12 mo, infants learning American English get better at distinguishing English [ɹ] and [l], as in “rock” vs. “lock,” relative to infants learning Japanese. Influential accounts of this early phonetic learning phenomenon initially proposed that infants group sounds into native vowel- and consonant-like phonetic categories—like [ɹ] and [l] in English—through a statistical clustering mechanism dubbed “distributional learning.” The feasibility of this mechanism for learning phonetic categories has been challenged, however. Here, we demonstrate that a distributional learning algorithm operating on naturalistic speech can predict early phonetic learning, as observed in Japanese and American English infants, suggesting that infants might learn through distributional learning after all. We further show, however, that, contrary to the original distributional learning proposal, our model learns units too brief and too fine-grained acoustically to correspond to phonetic categories. This challenges the influential idea that what infants learn are phonetic categories. More broadly, our work introduces a mechanism-driven approach to the study of early phonetic learning, together with a quantitative modeling framework that can handle realistic input. This allows accounts of early phonetic learning to be linked to concrete, systematic predictions regarding infants’ attunement.
  5. Most current theories and models of second language speech perception are grounded in the notion that learners acquire speech sound categories in their target language. In this paper, this classic idea in speech perception is revisited, given that clear evidence for formation of such categories is lacking in previous research. To understand the debate on the nature of speech sound representations in a second language, an operational definition of “category” is presented, and the issues of categorical perception and current theories of second language learning are reviewed. Following this, behavioral and neuroimaging evidence for and against acquisition of categorical representations is described. Finally, recommendations for future work are discussed. The paper concludes with a recommendation for integration of behavioral and neuroimaging work and theory in this area. 