A computational framework is proposed for virtual optimization of implant configurations of type 1 thyroplasty based on patient-specific laryngeal structures reconstructed from MRI images. Through integration of a muscle mechanics-based laryngeal posturing model, a flow-structure-acoustics interaction voice production model, a real-coded genetic algorithm, and virtual implant insertion, the framework acquires the implant configuration that achieves the optimal acoustic objectives. The framework is showcased by successfully optimizing an implant that restores acoustic features of a diseased voice resulting from unilateral vocal fold paralysis (UVFP) in producing a sustained vowel utterance. The sound intensity is improved from 62 dB (UVFP) to 81 dB (post-correction).
Acoustic correlates of the Javanese heavy vs. light distinction: A large-scale corpus study
According to the influential continuum model of phonation, only voiced segments can be specified as creaky or breathy. The present study investigated many possible phonetic correlates of the laryngeal contrast in Javanese word-initial prevocalic stop consonants, drawing upon a spoken corpus of more than 180,000 utterances. The results indicate that the laryngeal contrast is cued by voice onset time (VOT) and several acoustic-phonetic properties of the following vowel, including the first formant (F1) in addition to voice source measurements such as H1*-H2* and cepstral peak prominence (CPP). Taken together, these findings indicate that Javanese stops can be both voiceless and breathy, supporting a revision of the continuum model in which voicing and other aspects of phonation are decoupled.
- Award ID(s):
- 1941593
- PAR ID:
- 10492903
- Editor(s):
- Skarnitzl, Radek; Volín, Jan
- Publisher / Repository:
- Guarant International
- Date Published:
- Journal Name:
- Proceedings of the 20th International Congress of Phonetic Sciences
- ISSN:
- 2412-0669
- Page Range / eLocation ID:
- 843-847
- Subject(s) / Keyword(s):
- laryngeal contrast, acoustic correlates, phonation type, Javanese, corpus phonetics
- Format(s):
- Medium: X
- Location:
- Prague
- Sponsoring Org:
- National Science Foundation
More Like this
-
Two studies investigated the influence of conversational role on phonetic imitation toward human and voice-AI interlocutors. In a Word List Task, the giver instructed the receiver on which of two lists to place a word; this dialogue task is similar to simple spoken interactions users have with voice-AI systems. In a Map Task, participants completed a fill-in-the-blank worksheet with the interlocutors, a more complex interactive task. Participants completed the task twice with both interlocutors, once as giver-of-information and once as receiver-of-information. Phonetic alignment was assessed through similarity rating, analysed using mixed effects logistic regressions. In the Word List Task, participants aligned to a greater extent toward the human interlocutor only. In the Map Task, participants as giver only aligned more toward the human interlocutor. Results indicate that phonetic alignment is mediated by the type of interlocutor and that the influence of conversational role varies across tasks and interlocutors.
-
Purpose: The larynx plays a role in swallowing, respiration, and voice production. All three functions change during ontogeny. We investigated ontogenetic shape changes using a mouse model to inform our understanding of how laryngeal form and function are integrated. We understand the characterization of developmental changes to larynx anatomy as a critical step toward using rodent models to study human vocal communication disorders.
Method: Contrast-enhanced micro-computed tomography image stacks were used to generate three-dimensional reconstructions of the CD-1 mouse (Mus musculus) laryngeal cartilaginous framework. Then, we quantified size and shape in four age groups: pups, weanlings, young, and old adults using a combination of landmark and linear morphometrics. We analyzed postnatal patterns of growth and shape in the laryngeal skeleton, as well as morphological integration among four laryngeal cartilages using geometric morphometric methods. Acoustic analysis of vocal patterns was employed to investigate morphological and functional integration.
Results: Four cartilages scaled with negative allometry on body mass. Additionally, thyroid, arytenoid, and epiglottic cartilages, but not the cricoid cartilage, showed shape change associated with developmental age. A test for modularity between the four cartilages suggests greater independence of thyroid cartilage shape, hinting at the importance of embryological origin during postnatal development. Finally, mean fundamental frequency, but not fundamental frequency range, varied predictably with size.
Conclusion: In a mouse model, the four main laryngeal cartilages do not develop uniformly throughout the first 12 months of life. High-dimensional shape analysis effectively quantified variation in shape across development and in relation to size, as well as clarifying patterns of covariation in shape among cartilages and possibly the ventral pouch.
Supplemental Material: https://doi.org/10.23641/asha.12735917
-
Social and Functional Pressures in Vocal Alignment: Differences for Human and Voice-AI Interlocutors
Increasingly, people are having conversational interactions with voice-AI systems, such as Amazon’s Alexa. Do the same social and functional pressures that mediate alignment toward human interlocutors also predict alignment patterns toward voice-AI? We designed an interactive dialogue task to investigate this question. Each trial consisted of scripted, interactive turns between a participant and a model talker (pre-recorded from either a natural production or voice-AI): First, participants produced target words in a carrier phrase. Then, a model talker responded with an utterance containing the target word. The interlocutor responses varied by 1) communicative affect (social) and 2) correctness (functional). Finally, participants repeated the carrier phrase. Degree of phonetic alignment was assessed acoustically between the target word in the model’s response and participants’ response. Results indicate that social and functional factors distinctly mediate alignment toward AI and humans. Findings are discussed with reference to theories of alignment and human-computer interaction.
-
Automatic Speech Recognition (ASR) systems are widely used in various online transcription services and personal digital assistants. Emerging lines of research have demonstrated that ASR systems are vulnerable to hidden voice commands, i.e., audio that can be recognized by ASRs but not by humans. Such attacks, however, often either highly depend on white-box knowledge of a specific machine learning model or require special hardware to construct the adversarial audio. This paper proposes a new model-agnostic and easily-constructed attack, called CommanderGabble, which uses fast speech to camouflage voice commands. Both humans and ASR systems often misinterpret fast speech, and such misinterpretation can be exploited to launch hidden voice command attacks. Specifically, by carefully manipulating the phonetic structure of a target voice command, ASRs can be caused to derive a hidden meaning from the manipulated, high-speed version. We implement the discovered attacks both over-the-wire and over-the-air, and conduct a suite of experiments to demonstrate their efficacy against 7 practical ASR systems. Our experimental results show that the over-the-wire attacks can disguise as many as 96 out of 100 tested voice commands into adversarial ones, and that the over-the-air attacks are consistently successful for all 18 chosen commands in multiple real-world scenarios.