skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: “They sure aren’t from around here”: Children’s perception of accent distance in L1 and L2 varieties of English
Abstract Children exhibit preferences for familiar accents early in life. However, they frequently have more difficulty distinguishing between first language (L1) accents than second language (L2) accents in categorization tasks. Few studies have addressed children’s perception of accent strength, or the relation between accent strength and objective measures of pronunciation distance. To address these gaps, 6- and 12-year-olds and adults ranked talkers’ perceived distance from the local accent (i.e., Midland American English). Rankings were compared with objective distance measures. Acoustic and phonetic distance measures were significant predictors of ladder rankings, but there was no evidence that children and adults significantly differed in their sensitivity to accent strength. Levenshtein Distance, a phonetic distance metric, was the strongest predictor of perceptual rankings for both children and adults. As a percept, accent strength has critical implications for social judgments, which determine real world social outcomes for talkers with non-local accents.  more » « less
Award ID(s):
1941691 1941662
PAR ID:
10560259
Author(s) / Creator(s):
; ;
Publisher / Repository:
Cambridge University Press
Date Published:
Journal Name:
Journal of Child Language
ISSN:
0305-0009
Page Range / eLocation ID:
1 to 24
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Listeners attend to variation in segmental and prosodic cues when judging accent strength. The relative contributions of these cues to perceptions of accentedness in English remains open for investigation, although objective accent distance measures (such as Levenshtein distance) appear to be reliable tools for predicting perceptual distance. Levenshtein distance, however, only accounts for phonemic information in the signal. The purpose of the current study was to examine the relative contributions of phonemic (Levenshtein) and holistic acoustic (dynamic time warping) distances from the local accent to listeners’ accent rankings for nine non-local native and nonnative accents. Listeners (n =52) ranked talkers on perceived distance from the local accent (Midland American English) using a ladder task for three sentence-length stimuli. Phonemic and holistic acoustic distances between Midland American English and the other accents were quantified using both weighted and unweighted Levenshtein distance measures, and dynamic time warping (DTW). Results reveal that all three metrics contribute to perceived accent distance, with the weighted Levenshtein slightly outperforming the other measures. Moreover, the relative contribution of phonemic and holistic acoustic cues was driven by the speaker’s accent. Both nonnative and non-local native accents were included in this study, and the benefits of considering both of these accent groups in studying phonemic and acoustic cues used by listeners is discussed. 
    more » « less
  2. Unfamiliar accents can cause word recognition challenges, particularly in noisy environments, but few studies have incorporated quantitative pronunciation distance metrics to explain intelligibility differences across accents. To address this gap, intelligibility was measured for 18 talkers -- two from each of three first-language, one bilingual, and five second-language accents -- in quiet and two noise conditions. The relations between two edit distance metrics, which quantify phonetic differences from a reference accent, and intelligibility scores were assessed. Intelligibility was quantified through both fuzzy string matching and percent words correct. Both edit distance metrics were significantly related to intelligibility scores; a heuristic edit distance metric was the best predictor of intelligibility for both scoring methods. Further, there were stronger effects of edit distance as the listening condition increased in difficulty. Talker accent also contributed substantially to intelligibility models, but relations between accent and edit distance did not consistently pattern for the two talkers representing each accent. Frequency of production differences in vowels and consonants was negatively correlated with intelligibility, particularly for consonants. Together, these results suggest that significant amounts of variability in intelligibility across accents can be predicted by phonetic differences from the listener’s home accent. However, talker- and accent-specific pronunciation features, including suprasegmental characteristics, must be quantified to fully explain intelligibility across talkers and listening conditions. 
    more » « less
  3. Radek Skarnitzl & Jan Volín (Ed.)
    Unfamiliar native and non-native accents can cause word recognition challenges, particularly in noisy environments, but few studies have incorporated quantitative pronunciation distance metrics to explain intelligibility differences across accents. Here, intelligibility was measured for 18 talkers -- two from each of three native, one bilingual, and five non- native accents -- in three listening conditions (quiet and two noise conditions). Two variations of the Levenshtein pronunciation distance metric, which quantifies phonemic differences from a reference accent, were assessed for their ability to predict intelligibility. An unweighted Levenshtein distance metric was the best intelligibility predictor; talker accent further predicted performance. Accuracy did not fall along a native - non-native divide. Thus, phonemic differences from the listener’s home accent primarily determine intelligibility, but other accent- specific pronunciation features, including suprasegmental characteristics, must be quantified to fully explain intelligibility across talkers and listening conditions. These results have implications for pedagogical practices and speech perception theories. 
    more » « less
  4. Vowels vary in their acoustic similarity across regional dialects of American English, such that some vowels are more similar to one another in some dialects than others. Acoustic vowel distance measures typically evaluate vowel similarity at a discrete time point, resulting in distance estimates that may not fully capture vowel similarity in formant trajectory dynamics. In the current study, language and accent distance measures, which evaluate acoustic distances between talkers over time, were applied to the evaluation of vowel category similarity within talkers. These vowel category distances were then compared across dialects, and their utility in capturing predicted patterns of regional dialect variation in American English was examined. Dynamic time warping of mel-frequency cepstral coefficients was used to assess acoustic distance across the frequency spectrum and captured predicted Southern American English vowel similarity. Root-mean-square distance and generalized additive mixed models were used to assess acoustic distance for selected formant trajectories and captured predicted Southern, New England, and Northern American English vowel similarity. Generalized additive mixed models captured the most predicted variation, but, unlike the other measures, do not return a single acoustic distance value. All three measures are potentially useful for understanding variation in vowel category similarity across dialects. 
    more » « less
  5. Recent work on perceptual learning for speech has suggested that while high-variability training typically results in generalization, low-variability exposure can sometimes be sufficient for cross-talker generalization. We tested predictions of a similarity-based account, according to which, generalization depends on training-test talker similarity rather than on exposure to variability. We compared perceptual adaptation to second-language (L2) speech following single- or multiple-talker training with a round-robin design in which four L2 English talkers from four different first-language (L1) backgrounds served as both training and test talkers. After exposure to 60 L2 English sentences in one training session, cross-talker/cross-accent generalization was possible (but not guaranteed) following either multiple- or single-talker training with variation across training-test talker pairings. Contrary to predictions of the similarity-based account, adaptation was not consistently better for identical than for mismatched training-test talker pairings, and generalization patterns were asymmetrical across training-test talker pairs. Acoustic analyses also revealed a dissociation between phonetic similarity and cross-talker/cross-accent generalization. Notably, variation in adaptation and generalization related to variation in training phase intelligibility. Together with prior evidence, these data suggest that perceptual learning for speech may benefit from some combination of exposure to talker variability, training-test similarity, and high training phase intelligibility. 
    more » « less