Title: Quantitative Acoustic versus Deep Learning Metrics of Lenition
Spanish voiced stops /b, d, ɡ/ surface as fricatives [β, ð, ɣ] in intervocalic position due to a phonological process known as spirantization or, more broadly, lenition. However, phonetic studies reveal a great deal of variation and gradience in these surface forms, which range from fricative-like to approximant-like [β̞, ð̞, ɣ̞] and are conditioned by factors such as stress, place of articulation, flanking vowel quality, and speaking rate. Several acoustic measurements have been used to quantify the degree of lenition, but none is standard. In this study, the posterior probabilities of the sonorant and continuant phonological features, estimated by a deep learning Phonet model on a corpus of Argentinian Spanish, were used as measures of lenition and compared to traditional acoustic measurements of intensity, duration, and periodicity. When evaluated against known lenition factors (stress, place of articulation, surrounding vowel quality, word status, and speaking rate), the sonorant and continuant posterior probabilities predict lenition patterns that are similar to those predicted by relative acoustic intensity measures and that are in the direction expected by the effort-based view of lenition and by previous findings. These results suggest that Phonet is a reliable alternative or additional approach for investigating the degree of lenition.
Award ID(s): 2037266
PAR ID: 10444286
Journal Name: Languages
Volume: 8
Issue: 2
ISSN: 2226-471X
Page Range / eLocation ID: 98
Sponsoring Org: National Science Foundation
More Like this
  1. A deep learning Phonet model was evaluated as a method to measure lenition. In contrast to quantitative acoustic methods, recurrent networks were trained to recognize the posterior probabilities of sonorant and continuant phonological features in a corpus of Argentinian Spanish. When applied to intervocalic and post-nasal voiced and voiceless stops, the approach yielded lenition patterns similar to those previously reported. Additional patterns also emerged. The results suggest the validity of the approach as an alternative or addition to quantitative acoustic measures of lenition.
  2. Skarnitzl, Radek (Ed.)
    Alcohol is known to impair fine articulatory control and movements. In drunken speech, incomplete closure of the vocal tract can result in deaffrication of the English affricate sounds /tʃ/ and /ʤ/, spirantization (fricative-like production) of the stop consonants, and palatalization (retraction of place of articulation) of the alveolar fricative /s/ (produced as /ʃ/). Such categorical segmental errors have been well reported. This study employs a phonologically informed neural network approach to estimate degrees of deaffrication of /tʃ/ and /ʤ/, spirantization of /t/ and /d/, and place retraction for /s/ in a corpus of intoxicated English speech. Recurrent neural networks were trained to recognize the relevant phonological features [anterior], [continuant], and [strident] in a control speech corpus. Their posterior probabilities were computed over the segments produced under intoxication. The results revealed both categorical and gradient errors and thus suggest that this new approach can reliably quantify fine-grained errors in intoxicated speech.
  3. Skarnitzl, Radek (Ed.)
    Alcohol is known to impair fine articulatory control and movements. In drunken speech, incomplete closure of the vocal tract can result in deaffrication of the English affricate sounds /tʃ/ and /ʤ/, spirantization (fricative-like production) of the stop consonants, and palatalization (retraction of place of articulation) of the alveolar fricative /s/ (produced as /ʃ/). Such categorical segmental errors have been well reported. This study employs a phonologically informed neural network approach to estimate degrees of deaffrication of /tʃ/ and /ʤ/, spirantization of /t/ and /d/, and place retraction for /s/ in a corpus of intoxicated English speech. Recurrent neural networks were trained to recognize the relevant phonological features [anterior], [continuant], and [strident] in a control speech corpus. Their posterior probabilities were computed over the segments produced under intoxication. The results revealed both categorical and gradient errors and thus suggest that this new approach can reliably quantify fine-grained errors in intoxicated speech. Keywords: alcohol, deaffrication, palatalization, retraction, neural network.
  4. This paper applies the Autosegmental Metrical (AM) model of intonation phonology and the Spanish Tones and Break Indices (Sp_ToBI) annotation conventions to compare the intonational contours of declarative sentences in two varieties of Puerto Rican Spanish: (1) San Juan Spanish, spoken in the capital city of San Juan, and (2) Loíza Spanish, an Afro-Hispanic vernacular spoken in Loíza. The geographical proximity between these two municipalities entails constant contact within a shared linguistic space. However, speakers from San Juan perceive Loíza as a municipality that has its own peculiar way of speaking. The acoustic and phonological analysis was carried out with PRAAT to verify whether pitch accents coincide in the spontaneous speech of the two analyzed varieties. The data we examined contain an overall predominance of the bitonal pitch accents L*+H and L+ 
  5. As dialogue systems become more prevalent in the form of personalized assistants, there is an increasingly important role for systems which can socially engage the user by influencing social factors like rapport. For example, learning companions enhance learning through socio-motivational support and are more successful when users feel rapport. In this work, I explore social engagement in dialogue systems in terms of acoustic-prosodic entrainment; entrainment is a phenomenon where over the course of a conversation, speakers adapt their acoustic-prosodic features, becoming more similar in their pitch, intensity, or speaking rate. Correlated with rapport and task success, entrainment plays a significant role in how individuals connect; a system which can entrain has potential to improve social engagement by enhancing these factors. As a result of this work, I introduce a dialogue system which can entrain and investigate its effects on social factors like rapport. 