Title: Incorporating Word-level Phonemic Decoding into Readability Assessment
Current approaches in automatic readability assessment have found success with the use of large language models and transformer architectures. These techniques lead to accuracy improvement, but they do not offer the interpretability that is uniquely required by the audience most often employing readability assessment tools: teachers and educators. Recent work that employs more traditional machine learning methods has highlighted the linguistic importance of considering semantic and syntactic characteristics of text in readability assessment by utilizing handcrafted feature sets. Research in Education suggests that, in addition to semantics and syntax, phonetic and orthographic instruction are necessary for children to progress through the stages of reading and spelling development; children must first learn to decode the letters and symbols on a page to recognize words and phonemes and their connection to speech sounds. Here, we incorporate this word-level phonemic decoding process into readability assessment by crafting a phonetically-based feature set for grade-level classification for English. Our resulting feature set shows comparable performance to much larger, semantically- and syntactically-based feature sets, supporting the linguistic value of orthographic and phonetic considerations in readability assessment.
Award ID(s):
1763649
PAR ID:
10513286
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Date Published:
Page Range / eLocation ID:
8998–9009
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. N/A (Ed.)
    Automatic pronunciation assessment (APA) plays an important role in providing feedback for self-directed language learners in computer-assisted pronunciation training (CAPT). Several mispronunciation detection and diagnosis (MDD) systems have achieved promising performance based on end-to-end phoneme recognition. However, assessing the intelligibility of second language (L2) remains a challenging problem. One issue is the lack of large-scale labeled speech data from non-native speakers. Additionally, relying only on one aspect (e.g., accuracy) at a phonetic level may not provide a sufficient assessment of pronunciation quality and L2 intelligibility. It is possible to leverage segmental/phonetic-level features such as goodness of pronunciation (GOP); however, feature granularity may cause a discrepancy in prosodic-level (suprasegmental) pronunciation assessment. In this study, Wav2vec 2.0-based MDD and Goodness Of Pronunciation feature-based Transformer are employed to characterize L2 intelligibility. Here, an L2 speech dataset, with human-annotated prosodic (suprasegmental) labels, is used for multi-granular and multi-aspect pronunciation assessment and identification of factors important for intelligibility in L2 English speech. The study provides a transformative comparative assessment of automated pronunciation scores versus the relationship between suprasegmental features and listener perceptions, which taken collectively can help support the development of instantaneous assessment tools and solutions for L2 learners.
  2. We present an initial web-based tool for St. Lawrence Island/Central Siberian Yupik, an endangered language of Alaska and Russia. This work is supported by the local language community on St. Lawrence Island, and includes an orthographic utility to convert from standard Latin orthography into a fully transparent representation, a preliminary spell checker, a Latin-to-Cyrillic transliteration tool, and a preliminary Cyrillic-to-Latin transliteration tool. Also included is a utility to convert from standard Latin orthography into both IPA and Americanist phonetic notation. Our utility is also capable of explicitly marking syllable boundaries and stress in the standard Latin orthography using the conventions of Jacobson (2001), as well as in Cyrillic and in standard IPA notation. These tools are designed to facilitate the digitization of existing Yupik resources, facilitate additional linguistic field work, and most importantly, bolster efforts by the local Yupik communities in the U.S. and in Russia to promote Yupik usage and literacy, especially among Yupik youth. 
  3. Listeners draw on their knowledge of phonetic categories when identifying speech sounds, extracting meaningful structural features from auditory cues. We use a Bayesian model to investigate the extent to which their perceptions of linguistic content incorporate their full knowledge of the phonetic category structure, or only certain aspects of this knowledge. Simulations show that listeners are best modeled as attending primarily to the most salient phonetic feature of a category when interpreting a cue, possibly attending to other features only in cases of high ambiguity. These results support the conclusion that listeners ignore potentially informative correlations in favor of efficient communication. 
  4. Research has suggested that children who speak African American English (AAE) have difficulty using features produced in Mainstream American English (MAE) but not AAE, to comprehend sentences in MAE. However, past studies mainly examined dialect features, such as verbal -s, that are produced as final consonants with shorter durations when produced in conversation, which impacts their phonetic saliency. Therefore, it is unclear if previous results are due to the phonetic saliency of the feature or how AAE speakers process MAE dialect features more generally. This study evaluated if there were group differences in how AAE- and MAE-speaking children used the auxiliary verbs was and were, a dialect feature with increased phonetic saliency but produced differently between the dialects, to interpret sentences in MAE. Participants aged 6;5–10;0 years, who spoke MAE or AAE, completed the DELV-ST, a vocabulary measure (PVT), and a sentence comprehension task. In the sentence comprehension task, participants heard sentences in MAE that had either unambiguous or ambiguous subjects. Sentences with ambiguous subjects were used to evaluate group differences in sentence comprehension. AAE-speaking children were less likely than MAE-speaking children to use the auxiliary verbs was and were to interpret sentences in MAE. Furthermore, dialect density was predictive of Black participants’ sensitivity to the auxiliary verb. This finding is consistent with how the auxiliary verb is produced between the two dialects: was is used to mark both singular and plural subjects in AAE, while MAE uses was for singular and were for plural subjects. This study demonstrated that even when the dialect feature is more phonetically salient, differences between how verb morphology is produced in AAE and MAE impact how AAE-speaking children comprehend MAE sentences.
  5. Psycholinguistic research on children's early language environments has revealed many potential challenges for language acquisition. One is that in many cases, referents of linguistic expressions are hard to identify without prior knowledge of the language. Likewise, the speech signal itself varies substantially in clarity, with some productions being very clear, and others being phonetically reduced, even to the point of uninterpretability. In this study, we sought to better characterize the language‐learning environment of American English‐learning toddlers by testing how well phonetic clarity and referential clarity align in infant‐directed speech. Using an existing Human Simulation Paradigm (HSP) corpus with referential transparency measurements and adding new measures of phonetic clarity, we found that the phonetic clarity of words’ first mentions significantly predicted referential clarity (how easy it was to guess the intended referent from visual information alone) at that moment. Thus, when parents’ speech was especially clear, the referential semantics were also clearer. This suggests that young children could use the phonetics of speech to identify globally valuable instances that support better referential hypotheses, by homing in on clearer instances and filtering out less‐clear ones. Such multimodal “gems” offer special opportunities for early word learning.
    Research Highlights: In parent‐infant interaction, parents’ referential intentions are sometimes clear and sometimes unclear; likewise, parents’ pronunciation is sometimes clear and sometimes quite difficult to understand. We find that clearer referential instances go along with clearer phonetic instances, more so than expected by chance. Thus, there are globally valuable instances (“gems”) from which children could learn about words’ pronunciations and words’ meanings at the same time. Homing in on clear phonetic instances and filtering out less‐clear ones would help children identify these multimodal “gems” during word learning.