Title: Comparing human and machine's use of coarticulatory vowel nasalization for linguistic classification
Anticipatory coarticulation is a highly informative cue to upcoming linguistic information: listeners can identify that a word is ben and not bed from the vowel alone. The present study compares the performance of human listeners and a self-supervised pre-trained speech model (wav2vec 2.0) in using nasal coarticulation to classify vowels. Stimuli consisted of nasalized (from CVN words) and non-nasalized (from CVC words) American English vowels produced by 60 human talkers and generated in 36 TTS voices. In aggregate, wav2vec 2.0 performance is similar to human listener performance. Broken down by vowel type, both wav2vec 2.0 and listeners classify non-nasalized vowels produced naturally by humans more accurately. For TTS voices, however, wav2vec 2.0 classifies nasalized vowels more accurately than non-nasalized vowels. Speaker-level patterns reveal that listeners' use of coarticulation is highly variable across talkers; wav2vec 2.0 also shows cross-talker variability in performance. Analyses further reveal differences between listeners and wav2vec 2.0 in the use of multiple acoustic cues for nasalized vowel classification. Findings have implications for understanding how coarticulatory variation is used in speech perception. Results also can provide insight into how neural systems learn to attend to the unique acoustic features of coarticulation.
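The abstract does not describe the classification pipeline in detail. One common way to test what a self-supervised model "knows" is to pool its frame-level embeddings per utterance and fit a lightweight probe on top. A minimal sketch of that idea, assuming hypothetical mean-pooled embeddings and a nearest-centroid probe (the actual wav2vec 2.0 feature extraction and classifier are not shown here, and the toy data are illustrative):

```python
import numpy as np

def nearest_centroid_probe(train_X, train_y, test_X):
    """Classify pooled embeddings by cosine similarity to class centroids.

    train_X: (n, d) mean-pooled per-utterance embeddings (hypothetical
    stand-ins for wav2vec 2.0 hidden states); train_y: class labels.
    """
    classes = sorted(set(train_y))
    labels = np.array(train_y)
    # One L2-normalized centroid per class, for cosine comparison
    centroids = {}
    for c in classes:
        m = train_X[labels == c].mean(axis=0)
        centroids[c] = m / np.linalg.norm(m)
    preds = []
    for x in test_X:
        x = x / np.linalg.norm(x)
        preds.append(max(classes, key=lambda c: float(x @ centroids[c])))
    return preds

# Toy data: two well-separated "vowel context" classes in embedding space
rng = np.random.default_rng(0)
X_cvn = rng.normal(loc=+1.0, scale=0.1, size=(20, 8))
X_cvc = rng.normal(loc=-1.0, scale=0.1, size=(20, 8))
X = np.vstack([X_cvn, X_cvc])
y = ["CVN"] * 20 + ["CVC"] * 20
preds = nearest_centroid_probe(X, y, X)
acc = np.mean([p == t for p, t in zip(preds, y)])
```

The probe is deliberately simple: any above-chance accuracy then reflects structure already present in the embeddings rather than the classifier's capacity.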
Award ID(s):
2140183
PAR ID:
10603924
Author(s) / Creator(s):
Publisher / Repository:
JASA
Date Published:
Journal Name:
The Journal of the Acoustical Society of America
Volume:
156
Issue:
1
ISSN:
0001-4966
Page Range / eLocation ID:
489 to 502
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This study examines apparent-time variation in the use of multiple acoustic cues present on coarticulatorily nasalized vowels in California English. Eighty-nine listeners ranging in age from 18 to 58 (grouped into three apparent-time categories by year of birth) performed lexical identifications on syllables excised from words with oral and nasal codas, produced by six speakers with either minimal (n=3) or extensive (n=3) anticipatory nasal coarticulation (realized as greater vowel nasalization, F1 bandwidth, and diphthongization on vowels in CVN contexts). Results showed no differences across listeners in identification of extensively coarticulated vowels, or of oral vowels from either speaker type (all at ceiling). Yet performance on the minimal coarticulators' nasalized vowels was lowest for the oldest listener group and increased over apparent time. Perceptual cue-weighting analyses revealed that older listeners rely more on F1 bandwidth, while younger listeners rely more on acoustic nasality, as coarticulatory cues to lexical identity. Thus, there is evidence for apparent-time variation in the use of the different coarticulatory cues present on vowels. Younger listeners' cue weighting gives them flexibility to identify lexical items across a range of coarticulatory variation among (here, younger) speakers, while older listeners' cue weighting leads to reduced performance for talkers producing innovative phonetic forms. This study contributes to our understanding of the relationship between multidimensional acoustic features resulting from coarticulation and the perceptual re-weighting of cues that can lead to sound change over time.
  2. This study tests speech-in-noise perception and social ratings of speech produced by different text-to-speech (TTS) synthesis methods. We used identical speaker training datasets for a set of four voices (using AWS Polly TTS), generated with neural and concatenative TTS. In Experiment 1, listeners identified target words in semantically predictable and unpredictable sentences in concatenative and neural TTS at two noise levels (-3 dB and -6 dB SNR). Correct word identification was lower for neural than for concatenative TTS, at the lower SNR, and for semantically unpredictable sentences. In Experiment 2, listeners rated the voices on four social attributes: neural TTS was rated as more human-like, natural, likeable, and familiar than concatenative TTS. Furthermore, how natural listeners rated the neural TTS voice was positively related to their speech-in-noise accuracy. Together, these findings show that the TTS method influences both intelligibility and social judgments of speech, and that these patterns are linked. Overall, this work contributes to our understanding of the nexus of speech technology and human speech perception.
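Noise levels like -3 dB and -6 dB SNR are typically set by scaling the noise so that the speech-to-noise power ratio hits the target before mixing. A minimal sketch of that operation (the signals here are an illustrative tone and white noise, not the study's stimuli):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so the speech-to-noise power ratio equals snr_db, then mix."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power: p_speech / p_noise_scaled = 10 ** (snr_db / 10)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(1)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)  # 1 s, 220 Hz tone
noise = rng.normal(size=16000)
mixed = mix_at_snr(speech, noise, -3.0)

# Recover the scaled noise and verify the realized SNR in dB
scaled_noise = mixed - speech
snr_out = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))
```

At -3 dB the noise carries roughly twice the power of the speech, which is why intelligibility drops sharply between the two levels tested above.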
  3. This study investigates how California English speakers adjust nasal coarticulation and hyperarticulation on vowels across three speech styles: speaking slowly and clearly (imagining a hard-of-hearing addressee), casually (imagining a friend/family member addressee), and speaking quickly and clearly (imagining being an auctioneer). Results show covariation in speaking rate and vowel hyperarticulation across the styles. Additionally, results reveal that speakers produce more extensive anticipatory nasal coarticulation in the slow-clear speech style, in addition to a slower speech rate. These findings are interpreted in terms of accounts of coarticulation in which speakers selectively tune their production of nasal coarticulation based on the speaking style. 
  4. This study investigates apparent-time variation in the production of anticipatory nasal coarticulation in California English. Productions of consonant-vowel-nasal words in clear vs casual speech by 58 speakers aged 18–58 (grouped into three generations) were analyzed for degree of coarticulatory vowel nasality. Results reveal an interaction between age and style: the two younger speaker groups produce greater coarticulation (measured as A1-P0) in clear speech, whereas older speakers produce less variable coarticulation across styles. Yet, duration lengthening in clear speech is stable across ages. Thus, age- and style-conditioned changes in produced coarticulation interact as part of change in coarticulation grammars over time. 
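A1-P0, the coarticulation measure named above, compares the amplitude of the harmonic nearest the first formant (A1) with that of a low-frequency nasal peak harmonic (P0); greater nasalization lowers A1-P0. A minimal sketch on synthetic harmonic spectra (the peak frequencies and amplitudes are illustrative, not values from the study):

```python
import numpy as np

def a1_p0(harmonic_freqs, harmonic_amps_db, f1_hz, p0_hz):
    """A1-P0 nasality measure: amplitude (dB) of the harmonic nearest F1
    minus amplitude of the harmonic nearest the nasal peak frequency."""
    freqs = np.asarray(harmonic_freqs, dtype=float)
    amps = np.asarray(harmonic_amps_db, dtype=float)
    a1 = amps[np.argmin(np.abs(freqs - f1_hz))]
    p0 = amps[np.argmin(np.abs(freqs - p0_hz))]
    return a1 - p0

# Illustrative harmonic amplitudes (dB) at a 200 Hz f0, 200-3000 Hz
freqs = np.arange(200, 3001, 200)
oral = np.array([60, 55, 68, 62, 50, 45, 40, 38, 35, 30, 28, 25, 22, 20, 18.0])
# Nasalization boosts the low nasal peak and damps the harmonic near F1
nasal = oral.copy()
nasal[1] += 8   # stronger nasal peak near 400 Hz
nasal[2] -= 6   # damped harmonic near F1 (~600 Hz)

oral_val = a1_p0(freqs, oral, f1_hz=600, p0_hz=400)
nasal_val = a1_p0(freqs, nasal, f1_hz=600, p0_hz=400)
```

Because nasalization moves A1 down and P0 up, the nasalized spectrum yields a smaller A1-P0 than the oral one, which is the direction of the effect the study measures.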
  5. Purpose: This study examined the race identification of Southern American English speakers from two geographically distant regions in North Carolina. The purpose of this work is to explore how talkers' self-identified race, talker dialect region, and acoustic speech variables contribute to listener categorization of talker races. Method: Two groups of listeners heard a series of /h/–vowel–/d/ (/hVd/) words produced by Black and White talkers from East and West North Carolina, respectively. Results: Both Southern (North Carolina) and Midland (Indiana) listeners categorized the race of all speakers with greater-than-chance accuracy; however, Western North Carolina Black talkers were categorized with the lowest accuracy, just above chance. Conclusions: The results suggest that similarities in the speech production patterns of West North Carolina Black and White talkers affect the racial categorization of Black, but not White, talkers. The results are discussed with respect to the acoustic spectral features of the voices present in the sample population.