Acoustic analysis of typically developing elementary school-aged (prepubertal) children’s speech has been primarily performed on cross-sectional data in the past. Few studies have examined longitudinal data in this age group. For this presentation, we analyze the developmental changes in the acoustic properties of children’s speech using data collected longitudinally over four years (from first grade to fourth grade). Four male and four female children participated in this study. Data were collected once every year for each child. Using these data, we measured the four-year development of subglottal acoustics (first two subglottal resonances) and vowel acoustics (first four formants and fundamental frequency). Subglottal acoustic measurements are relatively independent of context, and average values were obtained for each child in each year. Vowel acoustics measurements were made for seven vowels (i, ɪ, ɛ, æ, ʌ, ɑ, u), each occurring in two different words in the stressed syllable. We investigated the correlations between the children’s subglottal acoustics, vowel acoustics, and growth-related variables such as standing height, sitting height, and chronological age. Gender-, vowel-, and child-specific analyses were carried out in order to shed light on how typically developing speech acoustics depend on such variables. [Work supported, in part, by the NSF.]
Developmental Articulatory and Acoustic Features for Six to Ten Year Old Children
In this paper, we study speech development in children using longitudinal acoustic and articulatory data. Data were collected yearly from grade 1 to grade 4 from four female and four male children. We analyze acoustic and articulatory properties of the four corner vowels /æ/, /i/, /u/, and /ɑ/, each occurring in two different words (different surrounding contexts). Acoustic features include formant frequencies and subglottal resonances (SGRs). Articulatory features include tongue curvature degree (TCD) and tongue curvature position (TCP). Based on the analyses, we observe the emergence of sex-based differences starting from grade 2. As in adults, the SGRs divide the vowel space into high, low, front, and back regions at least as early as grade 2. On average, TCD is correlated with vowel height and TCP with vowel frontness. Children in our study used varied articulatory configurations to achieve similar acoustic targets.
- Award ID(s):
- 2006979
- PAR ID:
- 10470344
- Publisher / Repository:
- ISCA
- Date Published:
- Page Range / eLocation ID:
- 4598 to 4602
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
A system is proposed for the lateral transfer of information from end-to-end neural networks that recognize articulatory feature classes to similarly structured networks that recognize phone tokens. The system connects recurrent layers of feature detectors pre-trained on a base language to recurrent layers of a phone recognizer for a different target language, a design inspired primarily by the progressive neural network scheme. Initial experiments used detectors trained on Bengali speech for four articulatory feature classes (consonant place, consonant manner, vowel height, and vowel backness) attached to phone recognizers for four other Asian languages (Javanese, Nepali, Sinhalese, and Sundanese). While these experiments do not currently suggest consistent performance improvements across different low-resource settings for target languages, irrespective of their genealogic or phonological relatedness to Bengali, they do suggest the need for further trials with different language sets, altered data sources and data configurations, and slightly altered network setups.
-
Spanish voiced stops /b, d, ɡ/ surface as fricatives [β, ð, ɣ] in intervocalic position due to a phonological process known as spirantization or, more broadly, lenition. However, phonetic studies reveal a great deal of variation and gradience in these surface forms, which range from fricative-like to approximant-like [β̞, ð̞, ɣ̞] and are conditioned by factors such as stress, place of articulation, flanking vowel quality, and speaking rate. Several acoustic measurements have been used to quantify the degree of lenition, but none is standard. In this study, the posterior probabilities of sonorant and continuant phonological features, estimated by a deep learning Phonet model on a corpus of Argentinian Spanish, were compared as measures of lenition to traditional acoustic measurements of intensity, duration, and periodicity. When evaluated against known lenition factors (stress, place of articulation, surrounding vowel quality, word status, and speaking rate), the results show that sonorant and continuant posterior probabilities predict lenition patterns that are similar to those predicted by relative acoustic intensity measures and are in the direction expected by the effort-based view of lenition and previous findings. These results suggest that Phonet is a reliable alternative or additional approach to investigate the degree of lenition.
-
We present the first ultrasound analysis of the secondary palatalization contrast in Irish, analyzing data from five speakers from the Connemara dialect group. Word-initial /pʲ(bʲ),pˠ(bˠ),tʲ,tˠ,kʲ,kˠ,fʲ,fˠ,sʲ,sˠ,xʲ,xˠ/ are analyzed in the context of /iː,uː/. We find, first, that tongue body position robustly distinguishes palatalized from velarized consonants, across place of articulation, manner, and vowel place contexts, with palatalized consonants having fronter and/or higher tongue body realizations than their velarized counterparts. This conclusion holds equally for labial consonants, contrary to some previous descriptive claims. Second, the nature and degree of palatalization and velarization depend in systematic ways on consonant place and manner. In coronal consonants, for example, velarization is weaker or absent. Third, the Irish consonants examined resist coarticulation in backness with a following vowel. In all of these respects Irish palatalization is remarkably similar to that of Russian. Our results also support an independent role for pharyngeal cavity expansion/retraction in the production of the palatalization contrast. Finally, we discuss preliminary findings on the dynamics of the secondary articulation gestures. Our use of principal component analysis (PCA) in reaching these findings is also of interest, since PCA has not been employed a great deal in analyses of tongue body movement.
-
Vowels vary in their acoustic similarity across regional dialects of American English, such that some vowels are more similar to one another in some dialects than others. Acoustic vowel distance measures typically evaluate vowel similarity at a discrete time point, resulting in distance estimates that may not fully capture vowel similarity in formant trajectory dynamics. In the current study, language and accent distance measures, which evaluate acoustic distances between talkers over time, were applied to the evaluation of vowel category similarity within talkers. These vowel category distances were then compared across dialects, and their utility in capturing predicted patterns of regional dialect variation in American English was examined. Dynamic time warping of mel-frequency cepstral coefficients was used to assess acoustic distance across the frequency spectrum and captured predicted Southern American English vowel similarity. Root-mean-square distance and generalized additive mixed models were used to assess acoustic distance for selected formant trajectories and captured predicted Southern, New England, and Northern American English vowel similarity. Generalized additive mixed models captured the most predicted variation, but, unlike the other measures, do not return a single acoustic distance value. All three measures are potentially useful for understanding variation in vowel category similarity across dialects.