An exploratory study on dialect density estimation for children and adult's African American English

Johnson, Alexander; Shankar, Natarajan Balaji; Ostendorf, Mari; Alwan, Abeer

doi:10.1121/10.0025771

Citation Details

An exploratory study on dialect density estimation for children and adult's African American English

This paper evaluates an innovative framework for spoken dialect density prediction on children's and adults' African American English. A speaker's dialect density is defined as the frequency with which dialect-specific language characteristics occur in their speech. Rather than treating the presence or absence of a target dialect in a user's speech as a binary decision, instead, a classifier is trained to predict the level of dialect density to provide a higher degree of specificity in downstream tasks. For this, self-supervised learning representations from HuBERT, handcrafted grammar-based features extracted from ASR transcripts, prosodic features, and other feature sets are experimented with as the input to an XGBoost classifier. Then, the classifier is trained to assign dialect density labels to short recorded utterances. High dialect density level classification accuracy is achieved for child and adult speech and demonstrated robust performance across age and regional varieties of dialect. Additionally, this work is used as a basis for analyzing which acoustic and grammatical cues affect machine perception of dialect. more »

Award ID(s):: 2202585 2202049

PAR ID:: 10506584

Author(s) / Creator(s):: Johnson, Alexander; Shankar, Natarajan Balaji; Ostendorf, Mari; Alwan, Abeer

Publisher / Repository:: Acoustical Society of America

Date Published:: 2024-04-01

Journal Name:: The Journal of the Acoustical Society of America

Volume:: 155

Issue:: 4

ISSN:: 0001-4966

Page Range / eLocation ID:: 2836 to 2848

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Journal Article:
https://doi.org/10.1121/10.0025771

More Like this