Our Collective Voices: The Social and Technical Values of a Grassroots Chinese Stuttered Speech Dataset

Li, Jingjin; Li, Qisheng; Gong, Rong; Wang, Lezhi; Wu, Shaomei

doi:10.1145/3715275.3732179

This content will become publicly available on June 23, 2026

The lack of authentic stuttered speech data has significantly limited the development of stuttering-friendly automatic speech recognition (ASR) models. In previous work, we collaborated with StammerTalk, a grassroots community of Chinese-speaking people who stutter (PWS), to collect the first stuttered speech dataset in Mandarin Chinese, containing 50 hours of conversational and command-recitation speech from 72 PWS. This work examines both the technical and social dimensions of the dataset. Through quantitative and qualitative analysis, as well as benchmarking and fine-tuning ASR models using the dataset, we demonstrate its technical value in capturing stuttered speech at an unprecedented scale and diversity, which enables better understanding and mitigation of fluency bias in ASR, and its social value in promoting self-advocacy and structural change for PWS in China. By foregrounding lived experiences of PWS in their own voices, we also see the potential of this dataset to normalize speech disfluencies and cultivate deeper empathy for stuttering within the AI research community.

Award ID(s): 2427710

PAR ID: 10618283

Author(s) / Creator(s): Li, Jingjin; Li, Qisheng; Gong, Rong; Wang, Lezhi; Wu, Shaomei

Publisher / Repository: ACM

Date Published: 2025-06-23

ISBN: 9798400714825

Page Range / eLocation ID: 2768 to 2783

Format(s): Medium: X

Location: Athens, Greece

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
https://doi.org/10.1145/3715275.3732179
