Title: A computational framework for patient-specific surgical planning of type 1 thyroplasty
A computational framework is proposed for virtual optimization of implant configurations of type 1 thyroplasty based on patient-specific laryngeal structures reconstructed from MRI images. Through integration of a muscle mechanics-based laryngeal posturing model, a flow-structure-acoustics interaction voice production model, a real-coded genetic algorithm, and virtual implant insertion, the framework acquires the implant configuration that achieves the optimal acoustic objectives. The framework is showcased by successfully optimizing an implant that restores acoustic features of a diseased voice resulting from unilateral vocal fold paralysis (UVFP) in producing a sustained vowel utterance. The sound intensity is improved from 62 dB (UVFP) to 81 dB (post-correction).
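To put the reported improvement in perspective, the short sketch below converts the 62 dB and 81 dB figures into a level gain and the corresponding intensity and pressure ratios. It assumes the values are sound levels on the standard decibel scale (10 log10 of an intensity ratio); the abstract does not state the reference or measurement conditions, and the function names are illustrative only.

```python
def level_gain_db(before_db: float, after_db: float) -> float:
    """Difference between two sound levels, in dB."""
    return after_db - before_db


def intensity_ratio(gain_db: float) -> float:
    """Ratio of acoustic intensities for a given gain (assumes 10*log10 scale)."""
    return 10 ** (gain_db / 10)


def pressure_ratio(gain_db: float) -> float:
    """Ratio of sound pressure amplitudes for a given gain (20*log10 scale)."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    # Values taken from the abstract: 62 dB (UVFP) and 81 dB (post-correction).
    gain = level_gain_db(62.0, 81.0)
    print(f"gain: {gain:.0f} dB")
    print(f"intensity ratio: {intensity_ratio(gain):.1f}")
    print(f"pressure ratio: {pressure_ratio(gain):.2f}")
```

Under that assumption, a 19 dB gain corresponds to roughly an 80-fold increase in acoustic intensity, or about a 9-fold increase in sound pressure amplitude.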
Award ID(s):
1652632 2328040
PAR ID:
10589011
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Acoustical Society of America (ASA)
Date Published:
Journal Name:
JASA Express Letters
Volume:
1
Issue:
12
ISSN:
2691-1191
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Skarnitzl, Radek; Volín, Jan (Ed.)
    According to the influential continuum model of phonation, only voiced segments can be specified as creaky or breathy. The present study investigated many possible phonetic correlates of the laryngeal contrast in Javanese word-initial prevocalic stop consonants, drawing upon a spoken corpus of more than 180,000 utterances. The results indicate that the laryngeal contrast is cued by voice onset time (VOT) and several acoustic-phonetic properties of the following vowel, including the first formant (F1) in addition to voice source measurements such as H1*-H2* and cepstral peak prominence (CPP). Taken together these findings indicate that Javanese stops can be both voiceless and breathy, supporting a revision of the continuum model in which voicing and other aspects of phonation are decoupled. 
  2. Purpose: The larynx plays a role in swallowing, respiration, and voice production. All three functions change during ontogeny. We investigated ontogenetic shape changes using a mouse model to inform our understanding of how laryngeal form and function are integrated. We understand the characterization of developmental changes to larynx anatomy as a critical step toward using rodent models to study human vocal communication disorders. Method: Contrast-enhanced micro-computed tomography image stacks were used to generate three-dimensional reconstructions of the CD-1 mouse (Mus musculus) laryngeal cartilaginous framework. Then, we quantified size and shape in four age groups (pups, weanlings, young adults, and old adults) using a combination of landmark and linear morphometrics. We analyzed postnatal patterns of growth and shape in the laryngeal skeleton, as well as morphological integration among four laryngeal cartilages using geometric morphometric methods. Acoustic analysis of vocal patterns was employed to investigate morphological and functional integration. Results: Four cartilages scaled with negative allometry on body mass. Additionally, thyroid, arytenoid, and epiglottic cartilages, but not the cricoid cartilage, showed shape change associated with developmental age. A test for modularity between the four cartilages suggests greater independence of thyroid cartilage shape, hinting at the importance of embryological origin during postnatal development. Finally, mean fundamental frequency, but not fundamental frequency range, varied predictably with size. Conclusion: In a mouse model, the four main laryngeal cartilages do not develop uniformly throughout the first 12 months of life. High-dimensional shape analysis effectively quantified variation in shape across development and in relation to size, as well as clarifying patterns of covariation in shape among cartilages and possibly the ventral pouch. Supplemental Material: https://doi.org/10.23641/asha.12735917
  3. This study replicates and extends the recent findings of Lee, Keating, and Kreiman [J. Acoust. Soc. Am. 146(3), 1568–1579 (2019)] on acoustic voice variation in read speech, which showed remarkably similar acoustic voice spaces for groups of female and male talkers and the individual talkers within these groups. Principal component analysis was applied to acoustic indices of voice quality measured from phone conversations for 99/100 of the same talkers studied previously. The acoustic voice spaces derived from spontaneous speech are highly similar to those based on read speech, except that unlike read speech, variability in fundamental frequency accounted for significant acoustic variability. Implications of these findings for prototype models of speaker recognition and discrimination are considered. 
  4. Automatic assessment of depression from speech signals is affected by variabilities in acoustic content and speakers. In this study, we focused on addressing these variabilities. We used a database comprised of recordings of interviews from a large number of female speakers: 735 individuals suffering from depressive (dysthymia and major depression) and anxiety disorders (generalized anxiety disorder, panic disorder with or without agoraphobia) and 953 healthy individuals. Leveraging this unique and extensive database, we built an i-vector framework. In order to capture various aspects of speech signals, we used voice quality features in addition to conventional cepstral features. The features (F0, F1, F2, F3, H1-H2, H2-H4, H4-H2k, A1, A2, A3, and CPP) were inspired by a psychoacoustic model of voice quality [1]. Two i-vector-based systems were developed: one using Mel Frequency Cepstral Coefficients (MFCCs) and another using voice quality features. Voice quality features performed as well as MFCCs. A score-level fusion was then used to combine these two systems, resulting in a 6% relative improvement in accuracy in comparison with the i-vector system based on MFCCs alone. The system was robust even when the duration of the utterances was shortened to 10 seconds.
  5. This paper presents the design and implementation of Scribe, a comprehensive voice processing and handwriting interface for voice assistants. Distinct from prior works, Scribe is a precise tracking interface that can co-exist with the voice interface on low sampling rate voice assistants. Scribe can be used for 3D free-form drawing, writing, and motion tracking for gaming. Taking handwriting as a specific application, it can also capture natural strokes and the individualized style of writing while occupying only a single frequency. The core technique includes an accurate acoustic ranging method called Cross Frequency Continuous Wave (CFCW) sonar, enabling voice assistants to use ultrasound as a ranging signal while using the regular microphone system of voice assistants as a receiver. We also design a new optimization algorithm that requires only a single frequency for time difference of arrival. The Scribe prototype achieves 73 μm of median error for 1D ranging and 1.4 mm of median error in 3D tracking of an acoustic beacon using the microphone array used in voice assistants. Our implementation of an in-air handwriting interface achieves 94.1% accuracy with automatic handwriting-to-text software, similar to writing on paper (96.6%). At the same time, the error rate of voice-based user authentication only increases from 6.26% to 8.28%.