NSF PAR Search | NSF Public Access Repository

Direct articulatory observation reveals phoneme recognition performance characteristics of a self-supervised speech model

https://doi.org/10.1121/10.0034430

Shi, Xuan; Feng, Tiantian; Huang, Kevin; Kadiri, Sudarsana Reddy; Lee, Jihwan; Lu, Yijing; Zhang, Yubin; Goldstein, Louis; Narayanan, Shrikanth (November 2024, JASA Express Letters)

Variability in speech pronunciation is widely observed across different linguistic backgrounds, which impacts modern automatic speech recognition performance. Here, we evaluate the performance of a self-supervised speech model in phoneme recognition using direct articulatory evidence. Findings indicate significant differences in phoneme recognition, especially in front vowels, between American English and Indian English speakers. To gain a deeper understanding of these differences, we conduct real-time MRI-based articulatory analysis, revealing distinct velar region patterns during the production of specific front vowels. This underscores the need to deepen the scientific understanding of self-supervised speech model variances to advance robust and inclusive speech technology.
Analysis of articulatory setting for L1 and L2 English speakers using MRI data

https://doi.org/10.21437/Interspeech.2024-2175

Huang, Kevin; Goldberg, Jack; Goldstein, Louis; Narayanan, Shrikanth (September 2024, ISCA)

Deep Speech Synthesis from MRI-Based Articulatory Representations

https://doi.org/10.21437/Interspeech.2023-2316

Wu, Peter; Li, Tingle; Lu, Yijing; Zhang, Yubin; Lian, Jiachen; Black, Alan W; Goldstein, Louis; Watanabe, Shinji; Anumanchipalli, Gopala K. (August 2023, ISCA)
Articulatory Representation Learning via Joint Factor Analysis and Neural Matrix Factorization

https://doi.org/10.1109/ICASSP49357.2023.10096401

Lian, Jiachen; Black, Alan W; Lu, Yijing; Goldstein, Louis; Watanabe, Shinji; Anumanchipalli, Gopala K. (June 2023, IEEE)

Speaker-Independent Acoustic-to-Articulatory Speech Inversion

https://doi.org/10.1109/ICASSP49357.2023.10096796

Wu, Peter; Chen, Li-Wei; Cho, Cheol Jun; Watanabe, Shinji; Goldstein, Louis; Black, Alan W; Anumanchipalli, Gopala K. (June 2023, IEEE)
Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition

https://doi.org/10.21437/Interspeech.2022-11233

Lian, Jiachen; Black, Alan W; Goldstein, Louis; Anumanchipalli, Gopala Krishna (September 2022, Interspeech 2022)

Deep Speech Synthesis from Articulatory Representations

https://doi.org/10.21437/Interspeech.2022-10892

Wu, Peter; Watanabe, Shinji; Goldstein, Louis; Black, Alan W; Anumanchipalli, Gopala Krishna (September 2022, Interspeech 2022)

