Title: Evaluating computational models of infant phonetic learning across languages
In the first year of life, infants' speech perception becomes attuned to the sounds of their native language. Many accounts of this early phonetic learning exist, but computational models predicting the attunement patterns observed in infants from the speech input they hear have been lacking. A recent study presented the first such model, drawing on algorithms proposed for unsupervised learning from naturalistic speech, and tested it on a single phone contrast. Here we study five such algorithms, selected for their potential cognitive relevance. We simulate phonetic learning with each algorithm and perform tests on three phone contrasts from different languages, comparing the results to infants' discrimination patterns. The five models display varying degrees of agreement with empirical observations, showing that our approach can help decide between candidate mechanisms for early phonetic learning, and providing insight into which aspects of the models are critical for capturing infants' perceptual development.
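Discrimination tests like those described above are commonly implemented as machine ABX tasks: for a triplet (A, B, X) where X belongs to the same phone category as A, the model is scored correct when its representation of X is closer to A than to B. The sketch below is a minimal illustration under simplified assumptions (frame-level representations compared with dynamic time warping); the function names are illustrative and not taken from the study.

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic time warping distance between two (frames, dims) arrays."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    # rough length normalization so longer tokens are not penalized
    return cost[n, m] / (n + m)

def abx_score(triplets):
    """Fraction of (A, B, X) triplets where X is closer to A (same category)."""
    correct = sum(dtw_distance(x, a) < dtw_distance(x, b) for a, b, x in triplets)
    return correct / len(triplets)
```

A score of 0.5 corresponds to chance; attunement to a contrast shows up as scores rising above chance for representations learned from the relevant language.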
Award ID(s): 1734245
PAR ID: 10176647
Author(s) / Creator(s):
Date Published:
Journal Name: Proceedings of the Annual Conference of the Cognitive Science Society
ISSN: 1069-7977
Page Range / eLocation ID: 571-577
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1.
    Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than nonnative ones. For example, between 6 to 8 and 10 to 12 mo of age, infants learning American English get better at distinguishing English [r] and [l], as in “rock” vs. “lock,” relative to infants learning Japanese. Influential accounts of this early phonetic learning phenomenon initially proposed that infants group sounds into native vowel- and consonant-like phonetic categories—like [r] and [l] in English—through a statistical clustering mechanism dubbed “distributional learning.” The feasibility of this mechanism for learning phonetic categories has been challenged, however. Here, we demonstrate that a distributional learning algorithm operating on naturalistic speech can predict early phonetic learning, as observed in Japanese and American English infants, suggesting that infants might learn through distributional learning after all. We further show, however, that, contrary to the original distributional learning proposal, our model learns units too brief and too fine-grained acoustically to correspond to phonetic categories. This challenges the influential idea that what infants learn are phonetic categories. More broadly, our work introduces a mechanism-driven approach to the study of early phonetic learning, together with a quantitative modeling framework that can handle realistic input. This allows accounts of early phonetic learning to be linked to concrete, systematic predictions regarding infants’ attunement.
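The "distributional learning" idea can be illustrated with a toy one-dimensional Gaussian mixture fit by expectation-maximization: clusters in an acoustic measurement (for example, a bimodal voice onset time distribution) are discovered from the data alone, with no category labels. This is only an illustration under simplified assumptions, not the model used in the study.

```python
import numpy as np

def em_gmm_1d(data, k=2, n_iter=50):
    """Fit a k-component 1-D Gaussian mixture with EM (toy distributional learner)."""
    # deterministic initialization: spread initial means across the data quantiles
    mu = np.quantile(data, (np.arange(k) + 0.5) / k)
    sigma = np.full(k, data.std())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point
        dens = pi * np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2) / sigma
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances from weighted data
        nk = resp.sum(axis=0)
        mu = (resp * data[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk)
        pi = nk / len(data)
    return mu, sigma, pi
```

On bimodal input the learner recovers two component means near the modes; on unimodal input the two components overlap, mirroring the classic prediction that category-like structure tracks the distribution of the input.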
  2.
    Early changes in infants’ ability to perceive native and nonnative speech sound contrasts are typically attributed to their developing knowledge of phonetic categories. We critically examine this hypothesis and argue that there is little direct evidence of category knowledge in infancy. We then propose an alternative account in which infants’ perception changes because they are learning a perceptual space that is appropriate to represent speech, without yet carving up that space into phonetic categories. If correct, this new account has substantial implications for understanding early language development.
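To make the contrast concrete: a learner can acquire a perceptual space without assigning anything to categories. The hypothetical sketch below (not the authors' proposal) learns a linear projection with PCA, so that dimensions along which the input varies dominate perceived distance, while no frame is ever labeled.

```python
import numpy as np

def learn_space(frames, dims=2):
    """Learn a linear perceptual space (PCA) from raw frames; no categories involved."""
    mu = frames.mean(axis=0)
    # right singular vectors of the centered data give the principal axes
    _, _, vt = np.linalg.svd(frames - mu, full_matrices=False)
    proj = vt[:dims].T        # (input_dims, dims) projection matrix
    return mu, proj

def perceptual_distance(x, y, space):
    """Distance between two frames after projection into the learned space."""
    mu, proj = space
    return np.linalg.norm((x - mu) @ proj - (y - mu) @ proj)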
  3. Words in infant-directed speech (IDS) are often phonetically reduced. This likely renders words harder for infants to learn and recognize. This difficulty might be mitigated by the repetitive nature of IDS, in particular if reduced instances are often preceded by clear instances (i.e., the first-mention effect). To characterize phonetic clarity in American English word repetitions, words were extracted from the IDS of eight mothers and presented to adults (n = 36) who judged their clarity. First mentions of repeated words were found to be clearer than second mentions, though this effect was small. Clarity was rated as greater for less common words and for utterance-final words. Clarity was also greater for words parents thought their child knew. The results help guide intuitions about the phonetic problem infants face when learning their first words.
  4. Human listeners are better at telling apart speakers of their native language than speakers of other languages, a phenomenon known as the language familiarity effect. The recent observation of such an effect in infants as young as 4.5 months of age (Fecher & Johnson, in press) has led to new difficulties for theories of the effect. On the one hand, retaining classical accounts, which rely on sophisticated knowledge of the native language (Goggin, Thompson, Strube, & Simental, 1991), requires an explanation of how infants could acquire this knowledge so early. On the other hand, letting go of these accounts requires an explanation of how the effect could arise in the absence of such knowledge. In this paper, we build on algorithms from unsupervised machine learning and zero-resource speech technology to propose, for the first time, a feasible acquisition mechanism for the language familiarity effect in infants. Our results show how, without relying on sophisticated linguistic knowledge, infants could develop a language familiarity effect through statistical modeling at multiple time-scales of the acoustics of the speech signal to which they are exposed.
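A toy version of such statistical modeling (at a single time-scale, with one Gaussian standing in for the richer models used in the paper): summarize heard speech as a distribution over acoustic frames, then score new speech by its average log-likelihood under that distribution. Speech that matches the familiar language scores higher, giving the listener a purely acoustic signal of familiarity.

```python
import numpy as np

def fit_acoustic_model(frames):
    """Summarize a listener's speech experience as one Gaussian over frames."""
    mu = frames.mean(axis=0)
    cov = np.cov(frames.T) + 1e-6 * np.eye(frames.shape[1])  # regularized covariance
    return mu, cov

def avg_loglik(frames, model):
    """Average Gaussian log-likelihood of new frames under the learned model."""
    mu, cov = model
    d = frames - mu
    inv = np.linalg.inv(cov)
    logdet = np.linalg.slogdet(cov)[1]
    k = frames.shape[1]
    mahal = np.einsum('ij,jk,ik->i', d, inv, d)  # per-frame Mahalanobis term
    return (-0.5 * (mahal + logdet + k * np.log(2 * np.pi))).mean()
```

The function and variable names here are illustrative assumptions; the paper's actual models operate at multiple time-scales rather than a single global Gaussian.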
  5. To begin learning their language, infants must locate words in the speech signal. Some models of word discovery presuppose that the discovery process depends on identifying phonetic segments (phones) in speech. To test the plausibility of models arguing that infants can reliably categorize consonants in speech, adult native speakers were asked to identify the consonant in vowel-consonant-vowel sequences extracted from spontaneous English infant-directed speech. Listeners could consistently identify some instances of consonants (for example, correctly indicating that an /s/ was an /s/). But many tokens (about half) were not consistently identifiable. Performance was significantly worse for codas than onsets. Providing the full utterance context in low-pass-filtered form did not aid recognition, nor did familiarization with the talker. In a second task, listeners were barely above chance in guessing whether a consonant was a word onset or a word-final coda. Performance on infant-directed speech was not markedly better than performance on a comparison set of adult-directed speech consonants. Erroneous responses frequently had little systematic resemblance to the correct answer. The results suggest that it is not plausible that infants can parse most utterances exhaustively into strings of uttered speech sounds and feed those strings into a statistical clustering mechanism. 
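Identification consistency of the kind measured here can be quantified as, for example, the fraction of tokens whose modal listener response reaches an agreement threshold. The helper below is hypothetical (names and the 75% threshold are assumptions, not the paper's analysis):

```python
from collections import Counter

def token_consistency(responses, threshold=0.75):
    """Fraction of tokens whose most common label reaches the agreement threshold.

    `responses` is a list of tokens; each token is a list with one
    listener's label per entry (e.g., ['s', 's', 'f', ...]).
    """
    consistent = 0
    for labels in responses:
        modal_count = Counter(labels).most_common(1)[0][1]
        if modal_count / len(labels) >= threshold:
            consistent += 1
    return consistent / len(responses)
```

On the pattern reported above, roughly half the tokens would fail such a threshold, which is the basis for the paper's argument against exhaustive phone-by-phone parsing.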