Automatic recognition of second language speech-in-noise

Kim, Seung-Eun (ORCID: 0000-0002-6247-7328); Chernyak, Bronya R.; Seleznova, Olga; Keshet, Joseph; Goldrick, Matthew; Bradlow, Ann R. (ORCID: 0000-0001-5560-6059)

doi:10.1121/10.0024877

Measuring how well human listeners recognize speech under varying environmental conditions (speech intelligibility) is a challenge for theoretical, technological, and clinical approaches to speech communication. The current gold standard, human transcription, is time- and resource-intensive. Recent advances in automatic speech recognition (ASR) systems raise the possibility of automating intelligibility measurement. This study tested four state-of-the-art ASR systems with second language speech-in-noise and found that one, Whisper, performed at or above human listener accuracy. However, the content of Whisper's responses diverged substantially from human responses, especially at lower signal-to-noise ratios, suggesting both opportunities and limitations for ASR-based speech intelligibility modeling.

Award ID(s): 2219843

PAR ID: 10589373

Author(s) / Creator(s): Kim, Seung-Eun; Chernyak, Bronya R.; Seleznova, Olga; Keshet, Joseph; Goldrick, Matthew; Bradlow, Ann R.

Publisher / Repository: Acoustical Society of America (ASA)

Date Published: 2024-02-13

Journal Name: JASA Express Letters

Volume: 4

Issue: 2

ISSN: 2691-1191

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1121/10.0024877
