Search for: All records

Award ID contains: 2042702


  1. Machine Learning Facilitated Investigations of Intonational Meaning: Prosodic Cues to Epistemic Shifts in American English Utterances
     Authors: Veilleux, Shattuck-Hufnagel, Jeong, Brugos, Ahn
     This work analyzes experimentally elicited speech to capture the relationship between prosody and semantic/pragmatic meanings. Production prompts were comic strips in which contexts were manipulated along axes prominently discussed in the semantics/pragmatics literature. Participants were asked to read lines as the speaker would, uttering a target phrase communicating a proposition p (e.g., “only marble is available”) to a hearer who had epistemic authority on p. Prompts varied whether the speaker’s initial belief (prior bias) was confirmed (condition A: bias=p) or corrected (condition B: bias=¬p); this meaning difference was reinforced by response particles (A: “okay so” vs. B: “oh really”) preceding the target phrase. Over 475 productions were annotated with phonologically-informed phonetic labels (PoLaR). To model many-to-many mappings between features (prosodic form) and classification (semantic/pragmatic meaning), Random Forests (RFs) were built on labels and derived measures (including f0 ranges, slopes, and TCoG) from 299 recordings, classifying meaning with high accuracy (>85%). The RFs identified condition-distinguishing prosodic cues in both the response particles and the target phrases, raising the question of whether and how functionally overlapping lexical content might affect prosodic realization. Moreover, the RFs identified phrase-final f0 as important, prompting deeper exploration of edge tones. These findings highlight how explanatory ML models can help iteratively improve targeted analysis.
  2. This study provides a proof-of-concept for a new method for analyzing intonational form and meaning, demonstrated by analysis of mirative utterances in American English. Here, K-means clustering using measures derived from PoLaR labels (i.e., TCoG) revealed emergent clusters of pitch accents that are suggestive of familiar phonological categories (e.g., MAE_ToBI L+H*). A Random Forest analysis then classified utterance-level meaning based on measures of both smaller granularity (related to individual pitch accents) and larger granularity (related to utterance-level meaning), showing >85% correct categorization of exclamative vs. filler sentences. This work has implications for how to model mappings between prosody and meaning, especially where existing phonological categories alone do not identify semantic/pragmatic categories.
  3. Phrase-level prosodic prominence in American English is understood, in the AM tradition, to be marked by pitch accents. While such prominences are characterized via tonal labels in ToBI (e.g. H*), their cues are not exclusively in the pitch domain: timing, loudness and voice quality are known to contribute to prominence perception. All of these cues occur with a wide degree of variability in naturally produced speech, and this variation may be informative. In this study, we advance towards a system of explicit labelling of individual cues to prosodic structure, here focusing on phrase-level prominence. We examine correlations between the presence of a set of 6 cues to prominence (relating to segment duration, loudness, and non-modal phonation, in addition to f0) and pitch accent labels in a corpus of ToBI-labelled American English speech. Results suggest that tokens with more cues are more likely to receive a pitch accent label. 
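Two of the abstracts above mention TCoG as a measure derived from PoLaR labels, without defining it. The sketch below assumes TCoG stands for Tonal Center of Gravity, computed as the f0-weighted mean time over a region; that reading, the function name, and the sample contours are illustrative assumptions, not the authors' implementation.

```python
def tonal_center_of_gravity(times, f0_values):
    """Return the f0-weighted mean time of a contour (assumed TCoG definition).

    times: sample times in seconds
    f0_values: f0 in Hz at each sample time (voiced samples only)
    """
    if len(times) != len(f0_values) or not times:
        raise ValueError("times and f0_values must be non-empty and equal in length")
    total_f0 = sum(f0_values)
    if total_f0 <= 0:
        raise ValueError("f0 values must sum to a positive number")
    # Each time point is weighted by its f0, so higher pitch pulls the center toward it.
    return sum(t * f for t, f in zip(times, f0_values)) / total_f0


# Hypothetical contours sampled at the same five time points.
times = [0.0, 0.1, 0.2, 0.3, 0.4]
early_peak = [200.0, 180.0, 140.0, 120.0, 110.0]  # pitch high at the start
late_peak = [110.0, 120.0, 140.0, 180.0, 200.0]   # pitch high at the end

tcog_early = tonal_center_of_gravity(times, early_peak)
tcog_late = tonal_center_of_gravity(times, late_peak)
```

Under this assumed definition, a contour whose high f0 falls early in the region yields a TCoG before the temporal midpoint, and a late-peaking contour yields one after it. A single number of this kind could then serve as one feature among others for clustering or classification.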