To begin learning their language, infants must locate words in the speech signal. Some models of word discovery presuppose that the discovery process depends on identifying phonetic segments (phones) in speech. To test the plausibility of models arguing that infants can reliably categorize consonants in speech, adult native speakers were asked to identify the consonant in vowel-consonant-vowel sequences extracted from spontaneous English infant-directed speech. Listeners could consistently identify some instances of consonants (for example, correctly indicating that an /s/ was an /s/). But many tokens (about half) were not consistently identifiable. Performance was significantly worse for codas than onsets. Providing the full utterance context in low-pass-filtered form did not aid recognition, nor did familiarization with the talker. In a second task, listeners were barely above chance in guessing whether a consonant was a word onset or a word-final coda. Performance on infant-directed speech was not markedly better than performance on a comparison set of adult-directed speech consonants. Erroneous responses frequently had little systematic resemblance to the correct answer. The results suggest that it is not plausible that infants can parse most utterances exhaustively into strings of uttered speech sounds and feed those strings into a statistical clustering mechanism.
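The notion of "consistent identification" can be made concrete with a simple per-token agreement measure over listener responses. The sketch below uses hypothetical response sets and an arbitrary 0.75 agreement threshold, not the study's actual tokens or criterion:

```python
from collections import Counter

def token_consistency(responses):
    """Proportion of listeners giving the modal (most common) response."""
    counts = Counter(responses)
    return max(counts.values()) / len(responses)

# Hypothetical listener responses for three extracted V-C-V tokens
# (labels and counts are illustrative, not data from the study).
tokens = {
    "asa_1": ["s", "s", "s", "s", "s", "s", "s", "z"],  # consistently /s/
    "ada_1": ["d", "t", "d", "g", "d", "b", "t", "d"],  # inconsistent
    "aka_1": ["k", "k", "g", "k", "k", "k", "t", "k"],
}

THRESHOLD = 0.75  # arbitrary cutoff for "consistently identified"
for name, resp in tokens.items():
    c = token_consistency(resp)
    print(f"{name}: agreement={c:.2f}, consistent={c >= THRESHOLD}")
```

On these made-up counts, the /s/ token clears the threshold while the intervocalic stop does not, mirroring the paper's finding that roughly half of tokens were not consistently identifiable.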
Speech can produce jet-like transport relevant to asymptomatic spreading of virus
Many scientific reports document that asymptomatic and presymptomatic individuals contribute to the spread of COVID-19, probably during conversations in social interactions. Droplet emission occurs during speech, yet few studies document the flow to provide the transport mechanism. This lack of understanding prevents informed public health guidance for risk reduction and mitigation strategies, e.g., the “6-foot rule.” Here we analyze flows during breathing and speaking, including phonetic features, using orders-of-magnitude estimates, numerical simulations, and laboratory experiments. We document the spatiotemporal structure of the expelled airflow. Phonetic characteristics of plosive sounds like “P” lead to enhanced directed transport, including jet-like flows that entrain the surrounding air. We highlight three distinct temporal scaling laws for the transport distance of exhaled material including 1) transport over a short distance (<0.5 m) in a fraction of a second, with large angular variations due to the complexity of speech; 2) a longer distance, ∼1 m, where directed transport is driven by individual vortical puffs corresponding to plosive sounds; and 3) a distance out to about 2 m, or even farther, where sequential plosives in a sentence, corresponding effectively to a train of puffs, create conical, jet-like flows. The latter dictates the long-time transport in a conversation. We believe that this work will inform thinking about the role of ventilation and aerosol transport in disease transmission for humans and other animals, and will yield a better understanding of linguistic aerodynamics, i.e., aerophonetics.
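The contrast between puff-driven and jet-driven transport can be sketched with the classical scaling laws for turbulent flows: a steady starting jet's front advances like t^(1/2), while an isolated vortex puff (conserved impulse) advances like t^(1/4) and stalls. The exponents are standard fluid-dynamics results and the prefactors below are purely illustrative, not values from this paper:

```python
import math

def jet_front(t, u0=1.0, d0=0.02, c=2.0):
    """Front position of a turbulent starting jet: grows like t**0.5.
    u0 = exit speed (m/s), d0 = mouth diameter (m); constants illustrative."""
    return c * math.sqrt(u0 * d0 * t)

def puff_front(t, impulse=1e-4, c=1.5):
    """Front position of an isolated vortex puff: grows like t**0.25.
    A single plosive stalls quickly; a train of puffs behaves jet-like."""
    return c * (impulse * t) ** 0.25

for t in (0.2, 1.0, 5.0):
    print(f"t={t:>4}s  jet={jet_front(t):.2f} m  puff={puff_front(t):.2f} m")
```

The different exponents are the point: at long times the t^(1/2) jet regime (a sentence's train of puffs) outruns the t^(1/4) single-puff regime, which is why sustained conversation can carry material to ~2 m and beyond.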
- Award ID(s): 2029370
- PAR ID: 10225044
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 117
- Issue: 41
- ISSN: 0027-8424
- Page Range / eLocation ID: 25237 to 25245
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than nonnative ones. For example, between 6 to 8 mo and 10 to 12 mo, infants learning American English get better at distinguishing English [r] and [l], as in “rock” vs. “lock,” relative to infants learning Japanese. Influential accounts of this early phonetic learning phenomenon initially proposed that infants group sounds into native vowel- and consonant-like phonetic categories—like [r] and [l] in English—through a statistical clustering mechanism dubbed “distributional learning.” The feasibility of this mechanism for learning phonetic categories has been challenged, however. Here, we demonstrate that a distributional learning algorithm operating on naturalistic speech can predict early phonetic learning, as observed in Japanese and American English infants, suggesting that infants might learn through distributional learning after all. We further show, however, that, contrary to the original distributional learning proposal, our model learns units too brief and too fine-grained acoustically to correspond to phonetic categories. This challenges the influential idea that what infants learn are phonetic categories. More broadly, our work introduces a mechanism-driven approach to the study of early phonetic learning, together with a quantitative modeling framework that can handle realistic input. This allows accounts of early phonetic learning to be linked to concrete, systematic predictions regarding infants’ attunement.
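The clustering mechanism at issue can be illustrated with a toy distributional learner: a two-component Gaussian mixture fit by expectation-maximization along a single acoustic dimension. This is a minimal sketch on synthetic voice-onset-time values, not the model or the features used in the study:

```python
import math
import random

def em_two_gaussians(data, iters=200):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization."""
    mu = [min(data), max(data)]
    m0 = sum(data) / len(data)
    v0 = sum((x - m0) ** 2 for x in data) / len(data)
    var = [v0, v0]          # start from the overall variance
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: per-point responsibilities for each component
        resp = []
        for x in data:
            p = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
            s = (p[0] + p[1]) or 1e-300
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate means, variances, and mixing weights
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk + 1e-6
            pi[k] = nk / len(data)
    return mu, var, pi

# Synthetic voice-onset-time values (ms): a short-lag /d/-like cluster and a
# long-lag /t/-like cluster. The numbers are illustrative, not study data.
rng = random.Random(1)
data = [rng.gauss(15, 5) for _ in range(200)] + [rng.gauss(70, 10) for _ in range(200)]
mu, var, pi = em_two_gaussians(data)
```

On well-separated synthetic input like this, the two recovered means land near the generating clusters; the paper's point is that on naturalistic speech, the units such a learner discovers need not line up with phonetic categories at all.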
-
We give an algorithm that computes exact maximum flows and minimum-cost flows on directed graphs with m edges and polynomially bounded integral demands, costs, and capacities in m^{1+o(1)} time. Our algorithm builds the flow through a sequence of m^{1+o(1)} approximate undirected minimum-ratio cycles, each of which is computed and processed in amortized m^{o(1)} time using a new dynamic graph data structure. Our framework extends to algorithms running in m^{1+o(1)} time for computing flows that minimize general edge-separable convex functions to high accuracy. This gives almost-linear time algorithms for several problems including entropy-regularized optimal transport, matrix scaling, p-norm flows, and p-norm isotonic regression on arbitrary directed acyclic graphs.
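The almost-linear-time algorithm itself is intricate; as a point of contrast for the problem being solved, the classical Edmonds–Karp method computes the same exact max flow in O(VE²) time, far from m^{1+o(1)}. A textbook sketch (not the paper's algorithm):

```python
from collections import deque

def edmonds_karp(n, edges, s, t):
    """Exact max flow by BFS augmenting paths. O(V * E^2) time.
    edges: list of (u, v, capacity) with 0 <= u, v < n."""
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow            # no augmenting path left
        # find the bottleneck capacity along the path, then push it
        v, b = t, float("inf")
        while v != s:
            b = min(b, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= b
            cap[v][u] += b
            v = u
        flow += b
```

The gap between this O(VE²) baseline and the paper's m^{1+o(1)} bound is exactly what the minimum-ratio-cycle framework and dynamic data structure buy.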
-
Speech sounds exist in a complex acoustic–phonetic space, and listeners vary in the extent to which they are sensitive to variability within the speech sound category (“gradience”) and the degree to which they show stable, consistent responses to phonetic stimuli. Here, we investigate the hypothesis that individual differences in the perception of the sound categories of one's language may aid speech-in-noise performance across the adult lifespan. Declines in speech-in-noise performance are well documented in healthy aging, and are, unsurprisingly, associated with differences in hearing ability. Nonetheless, hearing status and age are incomplete predictors of speech-in-noise performance, and long-standing research suggests that this ability draws on more complex cognitive and perceptual factors. In this study, a group of adults ranging in age from 18 to 67 years performed online assessments designed to measure phonetic category sensitivity, questionnaires querying recent noise exposure history and demographic factors, and crucially, a test of speech-in-noise perception. Results show that individual differences in the perception of two consonant contrasts significantly predict speech-in-noise performance, even after accounting for age and recent noise exposure history. This finding supports the hypothesis that individual differences in sensitivity to phonetic categories mediate speech perception in challenging listening situations.
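One common way to quantify gradience in phonetic categorization is the slope of a logistic identification function fitted to a listener's responses along a continuum: steeper slopes indicate more categorical responding, shallower slopes more gradient responding. The sketch below uses a crude grid-search fit and hypothetical continuum data; it is illustrative, not the study's analysis:

```python
import math

def logistic(x, x0, k):
    """Identification probability along the continuum: midpoint x0, slope k."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

def fit_slope(steps, p_b):
    """Grid-search midpoint x0 and slope k minimizing squared error."""
    best = (float("inf"), None, None)
    for x0 in [i / 10 for i in range(0, 81)]:
        for k in [j / 10 for j in range(1, 101)]:
            err = sum((logistic(x, x0, k) - p) ** 2 for x, p in zip(steps, p_b))
            if err < best[0]:
                best = (err, x0, k)
    return best[1], best[2]

# Hypothetical "/p/" response rates along an 8-step /b/-/p/ VOT continuum:
steps = list(range(8))
categorical = [0.02, 0.03, 0.05, 0.20, 0.85, 0.95, 0.97, 0.99]
gradient = [0.10, 0.25, 0.35, 0.45, 0.55, 0.70, 0.80, 0.90]

x0c, kc = fit_slope(steps, categorical)
x0g, kg = fit_slope(steps, gradient)
assert kc > kg  # the categorical listener has the steeper fitted function
```

An individual-differences analysis would then relate each listener's fitted slope (and its stability across sessions) to speech-in-noise scores, alongside age and noise-exposure covariates.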
-
In the first year of life, infants' speech perception becomes attuned to the sounds of their native language. Many accounts of this early phonetic learning exist, but computational models predicting the attunement patterns observed in infants from the speech input they hear have been lacking. A recent study presented the first such model, drawing on algorithms proposed for unsupervised learning from naturalistic speech, and tested it on a single phone contrast. Here we study five such algorithms, selected for their potential cognitive relevance. We simulate phonetic learning with each algorithm and perform tests on three phone contrasts from different languages, comparing the results to infants' discrimination patterns. The five models display varying degrees of agreement with empirical observations, showing that our approach can help decide between candidate mechanisms for early phonetic learning, and providing insight into which aspects of the models are critical for capturing infants' perceptual development.
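Model discrimination of a phone contrast in work like this is typically scored with a machine ABX test: X counts as correctly discriminated if its representation lies closer to a same-category token A than to an other-category token B. Below is a simplified, one-directional sketch with made-up 2-D representations; real evaluations average over directions, speakers, and contexts, and use learned features:

```python
from itertools import product

def abx_score(cat_a, cat_b, dist):
    """Fraction of (A, B, X) triples, X drawn from A's category, where
    dist(X, A) < dist(X, B). 0.5 is chance; 1.0 is perfect discrimination."""
    correct = total = 0
    for a, x in product(cat_a, repeat=2):
        if a is x:               # X must be a different token than A
            continue
        for b in cat_b:
            correct += dist(x, a) < dist(x, b)
            total += 1
    return correct / total

# Hypothetical 2-D model representations of [r]-like and [l]-like tokens.
r_tokens = [(1.0, 0.1), (1.1, 0.0), (0.9, 0.2)]
l_tokens = [(0.0, 1.0), (0.2, 0.9), (0.1, 1.1)]
d = lambda p, q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
print(abx_score(r_tokens, l_tokens, d))  # tight, separated clusters score 1.0
```

Comparing such scores across contrasts and languages against infants' discrimination patterns is what lets the five candidate algorithms be ranked.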