Title: DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules
Existing large language models (LLMs) that mainly focus on Standard American English (SAE) often lead to significantly worse performance when applied to other English dialects. While existing mitigations tackle discrepancies for individual target dialects, they assume access to high-accuracy dialect identification systems. The boundaries between dialects are inherently flexible, making it difficult to categorize language into discrete predefined categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic Aggregation), a modular approach to imbue SAE-trained models with multi-dialectal robustness by composing adapters that handle specific linguistic features. The compositional architecture of DADA allows for both targeted adaptation to specific dialect variants and simultaneous adaptation to various dialects. We show that DADA is effective for both single-task and instruction-finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.
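The abstract does not include an implementation, but the compositional idea (one adapter per dialectal linguistic rule, aggregated with input-dependent weights) can be sketched in plain PyTorch. Everything below, including the module names, the bottleneck size, and the attention-style scorer, is an illustrative assumption rather than the authors' code.

```python
# Minimal sketch (not the authors' released code) of composing
# feature-specific adapters with a learned, input-dependent aggregation.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """One adapter per dialectal linguistic feature (hypothetical sizes)."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual bottleneck transformation of the (frozen) LM's hidden states.
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))


class DynamicAggregation(nn.Module):
    """Attention-style weighting over the per-feature adapter outputs."""

    def __init__(self, hidden_dim: int, num_features: int):
        super().__init__()
        self.adapters = nn.ModuleList(
            [BottleneckAdapter(hidden_dim) for _ in range(num_features)]
        )
        self.scorer = nn.Linear(hidden_dim, num_features)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Each adapter handles one linguistic rule; stack their outputs.
        outputs = torch.stack([a(hidden_states) for a in self.adapters], dim=-2)
        # Input-dependent weights decide how much each adapter contributes
        # at every token position.
        weights = torch.softmax(self.scorer(hidden_states), dim=-1).unsqueeze(-1)
        return (weights * outputs).sum(dim=-2)


layer = DynamicAggregation(hidden_dim=768, num_features=10)
out = layer(torch.randn(2, 16, 768))  # (batch, seq_len, hidden)
print(out.shape)                      # torch.Size([2, 16, 768])
```

In a full system, layers like these would sit inside a frozen SAE-trained transformer, with only the adapters and the scorer trained, so additional feature adapters could be added without retraining the base model.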
Award ID(s):
2247357
PAR ID:
10506663
Publisher / Repository:
Association for Computational Linguistics
Date Published:
Journal Name:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Page Range / eLocation ID:
13776 to 13793
Format(s):
Medium: X
Location:
Singapore
Sponsoring Org:
National Science Foundation
More Like this
  1. Online data collection allows for access to diverse populations. In the current study, we used online recruitment and data collection methods to obtain a corpus of read speech from adult talkers representing three authentic regional dialects of American English and one novel dialect created for the corpus. The authentic dialects (New England, Northern, and Southern American English) are each represented by 8–10 talkers, ranging in age from 22 to 75 years old. The novel dialect was produced by five Spanish-English bilinguals with training in linguistics, who were asked to produce Spanish /o/ in an otherwise English segmental context. One vowel contrast was selected for each dialect, in which the vowels within the contrast are acoustically more similar in the target dialect than in the other dialects. Each talker produced one familiar short story with 40 tokens of each vowel within the target contrast for their dialect, as well as a set of real words and nonwords that represent both the target vowel contrast for their dialect and the other three vowel contrasts for comparison across dialects. Preliminary acoustic analysis reveals both cross-dialect and within-dialect variability in the target vowel contrasts. The corpus materials are available to the scholarly community. 
  2. Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis. This is largely due to the complexity and nuance involved in studying various dialects. We present a novel approach to extract distinguishing lexical features of dialects by utilizing interpretable dialect classifiers, even in the absence of human experts. We explore both posthoc and intrinsic approaches to interpretability, conduct experiments on Mandarin, Italian, and Low Saxon, and experimentally demonstrate that our method successfully identifies key language-specific lexical features that contribute to dialectal variations.
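As a concrete but hypothetical illustration of the posthoc route, one simple variant is to train a linear bag-of-words dialect classifier and read distinguishing lexical items off its coefficients. The sketch below assumes that setup; the sentences and labels are invented placeholders, not data from the paper.

```python
# Sketch of a posthoc approach: inspect the weights of a linear dialect
# classifier to surface distinguishing lexical features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Invented toy data standing in for two varieties of a language.
texts = [
    "wot are ye daein the noo",      # dialect A (placeholder)
    "whit time will ye be hame",     # dialect A (placeholder)
    "what are you doing right now",  # dialect B (placeholder)
    "what time will you be home",    # dialect B (placeholder)
]
labels = ["dialect_a", "dialect_a", "dialect_b", "dialect_b"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

# Words with the largest absolute coefficients are the lexical items that
# most strongly separate the two varieties.
vocab = vectorizer.get_feature_names_out()
ranked = sorted(zip(clf.coef_[0], vocab), key=lambda t: abs(t[0]), reverse=True)
print(ranked[:5])
```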
  3. Vowels vary in their acoustic similarity across regional dialects of American English, such that some vowels are more similar to one another in some dialects than others. Acoustic vowel distance measures typically evaluate vowel similarity at a discrete time point, resulting in distance estimates that may not fully capture vowel similarity in formant trajectory dynamics. In the current study, language and accent distance measures, which evaluate acoustic distances between talkers over time, were applied to the evaluation of vowel category similarity within talkers. These vowel category distances were then compared across dialects, and their utility in capturing predicted patterns of regional dialect variation in American English was examined. Dynamic time warping of mel-frequency cepstral coefficients was used to assess acoustic distance across the frequency spectrum and captured predicted Southern American English vowel similarity. Root-mean-square distance and generalized additive mixed models were used to assess acoustic distance for selected formant trajectories and captured predicted Southern, New England, and Northern American English vowel similarity. Generalized additive mixed models captured the most predicted variation, but, unlike the other measures, do not return a single acoustic distance value. All three measures are potentially useful for understanding variation in vowel category similarity across dialects. 
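For the dynamic-time-warping measure, a minimal sketch using librosa is given below; the file paths, sample rate, and number of MFCCs are placeholders, and the study's actual preprocessing is not reproduced.

```python
# Sketch of the DTW-over-MFCCs acoustic distance (not the study's exact
# pipeline). File paths, sample rate, and n_mfcc are placeholders.
import librosa


def vowel_token_distance(path_a: str, path_b: str, n_mfcc: int = 13) -> float:
    y_a, sr_a = librosa.load(path_a, sr=16000)
    y_b, sr_b = librosa.load(path_b, sr=16000)
    mfcc_a = librosa.feature.mfcc(y=y_a, sr=sr_a, n_mfcc=n_mfcc)
    mfcc_b = librosa.feature.mfcc(y=y_b, sr=sr_b, n_mfcc=n_mfcc)
    # Dynamic time warping aligns the two tokens in time; the accumulated
    # cost, normalised by warping-path length, serves as the acoustic
    # distance between the two vowel tokens.
    D, wp = librosa.sequence.dtw(X=mfcc_a, Y=mfcc_b, metric="euclidean")
    return float(D[-1, -1] / len(wp))


# e.g. distance between two vowel tokens excised from the same talker:
# print(vowel_token_distance("talker01_vowel1.wav", "talker01_vowel2.wav"))
```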
  4. The retraction of /s/ in /str/, e.g. street, is a sound change found in certain English dialects. Previous work suggests that /s/-retraction arises from lower spectral frequency /s/ in /str/. The extent to which /s/-retraction differs across English dialects is unclear. This paper presents results from a large-scale, acoustic phonetic study of sibilants in 420 speakers, from 6 spontaneous speech corpora (9 dialects) of North American and Scottish English. Spectral Centre of Gravity was modelled from automatic measures of word-initial sibilants. Female speakers show higher frequency sibilants than males, but more so for /s/ than /ʃ/; /s/ is also higher in American than Canadian/Scottish dialects; /ʃ/ is surprisingly variable. /s/-retraction, modelled as retraction ratios, is generally greater for /str/ than /spr skr/, but varies by dialect; females show more retraction in /str/ than males. Dialectal and social factors clearly influence /s/-retraction in English clusters /sp st sk/, /spr skr/, and /str/.
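Spectral Centre of Gravity itself is a standard power-weighted mean frequency; the sketch below shows one way to compute it for a sibilant segment with SciPy. The high-pass limit, window choice, and file name are illustrative assumptions, not the study's settings.

```python
# Sketch of a spectral Centre of Gravity (CoG) measure for a word-initial
# sibilant; the band limit and window are illustrative choices only.
import numpy as np
from scipy.signal import periodogram


def spectral_cog(samples: np.ndarray, sr: int, fmin: float = 1000.0) -> float:
    freqs, power = periodogram(samples.astype(float), fs=sr, window="hamming")
    band = freqs >= fmin  # discard low-frequency (non-frication) energy
    # CoG is the power-weighted mean frequency over the analysis band.
    return float(np.sum(freqs[band] * power[band]) / np.sum(power[band]))


# Usage (hypothetical): load an excised /s/ from /str/ and compare its CoG
# with /s/ before other clusters for the same speaker.
# from scipy.io import wavfile
# sr, sib = wavfile.read("speaker_str_s.wav")
# print(spectral_cog(sib, sr))
```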
  5. Hate speech and offensive language are rampant on social media. Machine learning has provided a way to moderate foul language at scale. However, much of the current research focuses on overall performance. Models may perform poorly on text written in a minority dialectal language. For instance, a hate speech classifier may produce more false positives on tweets written in African-American Vernacular English (AAVE). To measure these problems, we need text written in both AAVE and Standard American English (SAE). Unfortunately, it is challenging to curate data for all linguistic styles in a timely manner—especially when we are constrained to specific problems, social media platforms, or by limited resources. In this paper, we answer the question, “How can we evaluate the performance of classifiers across minority dialectal languages when they are not present within a particular dataset?” Specifically, we propose an automated fairness fuzzing tool called FuzzE to quantify the fairness of text classifiers applied to AAVE text using a dataset that only contains text written in SAE. Overall, we find that the fairness estimates returned by our technique moderately correlate with the use of real ground-truth AAVE text. Warning: Offensive language is displayed in this manuscript.
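The fairness quantity at stake can be made concrete with a small sketch: the gap in false-positive rates a classifier produces on benign SAE versus benign AAVE text. The functions below are illustrative only; FuzzE's contribution is estimating the AAVE side of this gap when no labelled AAVE text is available.

```python
# Sketch of a dialect fairness gap: difference in false-positive rates on
# benign SAE vs. benign AAVE text for a given classifier (placeholder code).
from typing import Callable, List

Classifier = Callable[[str], int]  # returns 1 if text is flagged as offensive


def false_positive_rate(classify: Classifier, benign_texts: List[str]) -> float:
    """Share of non-offensive texts that the classifier flags as offensive."""
    flagged = sum(classify(t) for t in benign_texts)
    return flagged / len(benign_texts)


def fpr_gap(classify: Classifier,
            benign_sae: List[str],
            benign_aave: List[str]) -> float:
    # A positive gap means benign AAVE text is flagged more often than
    # benign SAE text, i.e. the classifier is less fair to AAVE writers.
    return (false_positive_rate(classify, benign_aave)
            - false_positive_rate(classify, benign_sae))
```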