

This content will become publicly available on June 28, 2025

Title: Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers
Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis, largely due to the complexity and nuance involved in studying dialectal variation. We present a novel approach to extracting the distinguishing lexical features of dialects by utilizing interpretable dialect classifiers, even in the absence of human experts. We explore both post hoc and intrinsic approaches to interpretability, conduct experiments on Mandarin, Italian, and Low Saxon, and experimentally demonstrate that our method successfully identifies key language-specific lexical features that contribute to dialectal variation.
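To make the core idea concrete, here is a minimal sketch of an intrinsically interpretable dialect classifier: a linear model over TF-IDF word features whose per-class weights directly expose distinguishing lexical items. The toy two-dialect data and model choice are illustrative assumptions, not the paper's actual setup or datasets.

```python
# Minimal sketch: an intrinsically interpretable (linear) dialect classifier
# whose per-class weights surface distinguishing lexical features.
# The toy corpus below is hypothetical; the paper's models and data differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
import numpy as np

texts = [
    "wee bairn gangs doon the road",
    "the bairn gangs tae the shop",
    "wee lassie doon by the burn",
    "the little child goes down the road",
    "the child goes to the shop",
    "little girl down by the stream",
]
labels = [0, 0, 0, 1, 1, 1]  # 0 / 1 = two hypothetical dialect groups

vec = TfidfVectorizer()
X = vec.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

# For a linear model the coefficients are directly interpretable: the
# extremes of the weight vector are the lexical markers of each dialect.
feats = np.array(vec.get_feature_names_out())
order = clf.coef_[0].argsort()
print("dialect-0 markers:", feats[order[:5]])
print("dialect-1 markers:", feats[order[-5:]])
```

A post hoc alternative would apply an attribution method to a black-box neural classifier instead of reading off linear weights; the extraction step (ranking words by their contribution to the dialect decision) stays the same.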
Award ID(s):
2142739 2203097 2125201
PAR ID:
10520226
Author(s) / Creator(s):
Publisher / Repository:
NAACL
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Existing large language models (LLMs), which mainly focus on Standard American English (SAE), often perform significantly worse when applied to other English dialects. While existing mitigations tackle discrepancies for individual target dialects, they assume access to high-accuracy dialect identification systems. The boundaries between dialects are inherently flexible, making it difficult to categorize language into discrete predefined categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic Aggregation), a modular approach to imbue SAE-trained models with multi-dialectal robustness by composing adapters that handle specific linguistic features. The compositional architecture of DADA allows for both targeted adaptation to specific dialect variants and simultaneous adaptation to various dialects. We show that DADA is effective for both single-task and instruction-finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.
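A hedged sketch of the compositional idea behind DADA follows: several small bottleneck adapters, each nominally responsible for one linguistic feature, whose outputs are dynamically aggregated by a learned gate. Module names, shapes, and the aggregation rule are illustrative assumptions, not DADA's released code.

```python
# Illustrative sketch of adapter composition with dynamic aggregation.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """A residual bottleneck adapter, one per targeted linguistic feature."""
    def __init__(self, dim, bottleneck=16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))

class DynamicAggregation(nn.Module):
    """Learned soft weighting over the feature-specific adapters."""
    def __init__(self, dim, n_adapters):
        super().__init__()
        self.adapters = nn.ModuleList(Adapter(dim) for _ in range(n_adapters))
        self.gate = nn.Linear(dim, n_adapters)

    def forward(self, h):
        w = torch.softmax(self.gate(h), dim=-1)                  # (..., n_adapters)
        outs = torch.stack([a(h) for a in self.adapters], dim=-1)  # (..., dim, n)
        return (outs * w.unsqueeze(-2)).sum(-1)                  # weighted mix

h = torch.randn(2, 8, 768)  # e.g., frozen encoder hidden states
print(DynamicAggregation(768, 4)(h).shape)  # torch.Size([2, 8, 768])
```

Because the gate weights are inspectable per token, this style of composition is also what makes the adaptation interpretable: one can read off which dialect-feature adapters fire on which inputs.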
  2. Abstract

    A fundamental question in neurolinguistics concerns the brain regions involved in syntactic and semantic processing during speech comprehension, both at the lexical (word processing) and supra-lexical levels (sentence and discourse processing). To what extent are these regions separated or intertwined? To address this question, we introduce a novel approach exploiting neural language models to generate high-dimensional feature sets that separately encode semantic and syntactic information. More precisely, we train a lexical language model, GloVe, and a supra-lexical language model, GPT-2, on a text corpus from which we selectively removed either syntactic or semantic information. We then assess to what extent the features derived from these information-restricted models are still able to predict the fMRI time courses of humans listening to naturalistic text. Furthermore, to determine the windows of integration of brain regions involved in supra-lexical processing, we manipulate the size of contextual information provided to GPT-2. The analyses show that, while most brain regions involved in language comprehension are sensitive to both syntactic and semantic features, the relative magnitudes of these effects vary across these regions. Moreover, regions that are best fitted by semantic or syntactic features are more spatially dissociated in the left hemisphere than in the right one, and the right hemisphere shows sensitivity to longer contexts than the left. The novelty of our approach lies in the ability to control for the information encoded in the models’ embeddings by manipulating the training set. These “information-restricted” models complement previous studies that used language models to probe the neural bases of language, and shed new light on its spatial organization.

     
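The analysis pipeline here is a standard encoding model: regress each voxel's fMRI time course on model-derived features and score held-out predictions. The sketch below uses synthetic data as a stand-in for the study's embeddings and recordings; the dimensions and ridge setup are assumptions, not the paper's exact configuration.

```python
# Minimal encoding-model sketch: predict fMRI time courses from
# language-model features, then score each voxel on held-out data.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 300))  # per-timepoint features (GloVe or GPT-2)
Y = X @ rng.standard_normal((300, 10)) + 0.5 * rng.standard_normal((500, 10))  # 10 voxels

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, shuffle=False)
model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X_tr, Y_tr)
pred = model.predict(X_te)

# Per-voxel correlation between predicted and observed signal. Comparing
# these maps for syntax- vs. semantics-restricted feature sets is the
# study's way of localizing each kind of information in the brain.
r = [np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(Y.shape[1])]
print(np.round(r, 2))
```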
  3. We conduct a large-scale, systematic study to evaluate existing evaluation methods for natural language generation in the context of generating online product reviews. We compare human evaluators with a variety of automated evaluation procedures, including discriminative evaluators that measure how well machine-generated text can be distinguished from human-written text, as well as word-overlap metrics that assess how closely the generated text matches human-written references. We determine to what extent these different evaluators agree on the ranking of a dozen state-of-the-art generators of online product reviews. We find that human evaluators do not correlate well with discriminative evaluators, raising the broader question of whether adversarial accuracy is the right objective for natural language generation. In general, distinguishing machine-generated text is challenging even for human evaluators, and human judgments correlate better with lexical overlap. We find lexical diversity to be an intriguing metric that is indicative of the assessments of different evaluators. A post-experiment survey of participants provides insights into how to evaluate and improve the quality of natural language generation systems.
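Two of the evaluator families compared above can be illustrated with simple implementations: a word-overlap score against a human-written reference and a distinct-n lexical-diversity score. These are simplified stand-ins, not the exact metrics used in the paper.

```python
# Simplified illustrations of a word-overlap metric and a lexical-diversity
# metric of the kinds compared in the study.
from collections import Counter

def unigram_overlap(generated, reference):
    """Clipped unigram precision (the BLEU-1 core, without brevity penalty)."""
    gen, ref = Counter(generated.split()), Counter(reference.split())
    clipped = sum(min(c, ref[w]) for w, c in gen.items())
    return clipped / max(sum(gen.values()), 1)

def distinct_n(text, n=2):
    """Share of n-grams that are unique: a common lexical-diversity score."""
    toks = text.split()
    grams = [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(grams)) / max(len(grams), 1)

gen = "great phone great battery great screen"
print(unigram_overlap(gen, "the phone has a great battery and screen"))  # ~0.67
print(distinct_n(gen))  # 1.0: all bigrams distinct
```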
  4. In this work, we induce character-level noise in various forms when fine-tuning BERT to enable zero-shot cross-lingual transfer to unseen dialects and languages. We fine-tune BERT on three sentence-level classification tasks and evaluate our approach on an assortment of unseen dialects and languages. We find that character-level noise can be an extremely effective agent of cross-lingual transfer under certain conditions, while it is not as helpful in others. Specifically, we explore these differences in terms of the nature of the task and the relationship between source and target languages, finding that introducing character-level noise during fine-tuning is particularly helpful when the task draws on surface-level cues and the source-target cross-lingual pair has relatively high lexical overlap with shorter (i.e., less meaningful) unseen tokens on average.
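A minimal sketch of the kind of character-level noising described: randomly deleting, duplicating, or substituting characters in the fine-tuning data before tokenization. The specific noise operations and rate here are illustrative assumptions; the paper explores several variants.

```python
# Illustrative character-level noise injection for fine-tuning data.
import random

def char_noise(text, rate=0.1, rng=random.Random(0)):
    out = []
    for ch in text:
        if ch != " " and rng.random() < rate:
            op = rng.choice(["delete", "duplicate", "substitute"])
            if op == "duplicate":
                out.append(ch + ch)
            elif op == "substitute":
                out.append(rng.choice("abcdefghijklmnopqrstuvwxyz"))
            # "delete": append nothing, dropping the character
        else:
            out.append(ch)
    return "".join(out)

print(char_noise("fine-tune BERT with character-level noise"))
```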
  5. Purpose: The extant literature suggests that individual differences in speech perception can be linked to broad receptive language phenotype. For example, a recent study found that individuals with a smaller receptive vocabulary showed diminished lexically guided perceptual learning compared to individuals with a larger receptive vocabulary. Here, we examined (a) whether such individual differences stem from variation in reliance on lexical information or variation in perceptual learning itself and (b) whether a relationship exists between lexical recruitment and lexically guided perceptual learning more broadly, as predicted by current models of lexically guided perceptual learning. Method: In Experiment 1, adult participants (n = 70) completed measures of receptive and expressive language ability, lexical recruitment, and lexically guided perceptual learning. In Experiment 2, adult participants (n = 120) completed the same lexical recruitment and lexically guided perceptual learning tasks to provide a high-powered replication of the primary findings from Experiment 1. Results: In Experiment 1, individuals with weaker receptive language ability showed increased lexical recruitment relative to individuals with stronger receptive language ability; however, receptive language ability did not predict the magnitude of lexically guided perceptual learning. Moreover, the results of both experiments converged to show no evidence of a relationship between lexical recruitment and lexically guided perceptual learning. Conclusion: The current findings suggest that (a) individuals with weaker language ability demonstrate increased reliance on lexical information for speech perception compared to those with stronger receptive language ability; (b) individuals with weaker language ability maintain an intact perceptual learning mechanism; and (c) to the degree that the measures used here accurately capture individual differences in lexical recruitment and lexically guided perceptual learning, there is no graded relationship between these two constructs.