


Title: Co-speech gestures influence the magnitude and stability of articulatory movements: evidence for coupling-based enhancement
Humans rarely speak without producing co-speech gestures of the hands, head, and other parts of the body. Co-speech gestures are also highly restricted in how they are timed with speech, typically synchronizing with prosodically prominent syllables. What functional principles underlie this relationship? Here, we examine how the production of co-speech manual gestures influences spatiotemporal patterns of the oral articulators during speech production. We provide novel evidence that words uttered with accompanying co-speech gestures are produced with more extreme tongue and jaw displacement, and that the presence of a co-speech gesture contributes to greater temporal stability of oral articulatory movements. This effect, which we term coupling enhancement, differs from stress-based hyperarticulation in that differences in articulatory magnitude are not vowel-specific in their patterning. Speech and gesture synergies therefore constitute an independent variable to consider when modeling the effects of prosodic prominence on articulatory patterns. Our results are consistent with work in language acquisition and speech-motor control suggesting that synchronizing speech to gesture can entrain acoustic prominence.
Award ID(s):
2306149
PAR ID:
10644459
Publisher / Repository:
Nature Portfolio
Date Published:
Journal Name:
Scientific Reports
Volume:
15
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Co-speech gestures are timed to occur with prosodically prominent syllables in several languages. In prior work in Indo-European languages, gestures are found to be attracted to stressed syllables, with gesture apexes preferentially aligning with syllables bearing higher and more dynamic pitch accents. Little research has examined the temporal alignment of co-speech gestures in African tonal languages, where metrical prominence is often hard to identify due to a lack of canonical stress correlates, and where a key function of pitch is in distinguishing between words, rather than marking intonational prominence. Here, we examine the alignment of co-speech gestures in two different Niger-Congo languages with very different word structures, Medʉmba (Grassfields Bantu, Cameroon) and Igbo (Igboid, Nigeria). Our findings suggest that the initial position in the stem tends to attract gestures in Medʉmba, while the final syllable in the word is the default position for gesture alignment in Igbo; phrase position also influences gesture alignment, but in language-specific ways. Though neither language showed strong evidence of elevated prominence of any individual tone value, gesture patterning in Igbo suggests that metrical structure at the level of the tonal foot is relevant to the speech-gesture relationship. Our results demonstrate how the speech-gesture relationship can be a window into patterns of word- and phrase-level prosody cross-linguistically. They also show that the relationship between gesture and tone (and the related notion of ‘tonal prominence’) is mediated by tone’s function in a language.  
  2.
    This research establishes a better understanding of syntax choices in speech interactions and of how speech, gesture, and multimodal gesture-and-speech interactions are produced by users in unconstrained object manipulation environments using augmented reality. The work presents a multimodal elicitation study conducted with 24 participants. The canonical referents for translation, rotation, and scale were used along with some abstract referents (create, destroy, and select). In this study, time windows for gesture and speech multimodal interactions are developed using the start and stop times of gestures and speech as well as the stroke times for gestures. While gestures commonly precede speech by 81 ms, we find that the stroke of the gesture is commonly within 10 ms of the start of speech, indicating that the information content of a gesture and its co-occurring speech are well aligned with each other. Lastly, the trends across the most common proposals for each modality are examined, showing that disagreement between proposals is often caused by variation in hand posture or syntax. This allows us to present aliasing recommendations to increase the percentage of users' natural interactions captured by future multimodal interactive systems.
  3. Stressed syllables in languages which have them tend to show two interesting properties: They show patterns of phonetic ‘enhancement’ at the articulatory and acoustic levels, and they also show coordinative properties. They typically play a key role in coordinating speech with co-speech gesture, in coordination with a musical beat, and in other sensorimotor synchronization tasks such as speech-coordinated beat tapping and metronome timing. While various phonological theories have considered stress from both of these perspectives, there is as yet no clear explanation as to how these properties relate to one another. The present work tests the hypothesis that aspects of phonetic enhancement may in fact be driven by coordination itself by observing how phonetic patterns produced by speakers of two prosodically-distinct languages—English and Medʉmba (Grassfields Bantu)—vary as a function of timing relations with an imaginary metronome beat. Results indicate that production of syllables in time (versus on the ‘offbeat’) with the imaginary beat led to increased duration and first formant frequency—two widely observed correlates of syllable stress—for speakers of both languages. These results support the idea that some patterns of phonetic enhancement may have their roots in coordinative practices.  
  4. Skarnitzl, R. & (Ed.)
    The timing of both manual co-speech gestures and head gestures is sensitive to the prosodic structure of speech. However, head gestures are used not only by speakers, but also by listeners as a backchanneling device. Little research exists on the timing of gestures in backchanneling. To address this gap, we compare the timing of listener and speaker head gestures in an interview context. Results reveal the dual role that head gestures play in speech and conversational interaction: while they are coordinated in key ways with one's own speech, they are also coordinated with the gestures (and hence, the speech) of a conversation partner when one is actively listening to them. We also show that head gesture timing is sensitive to social dynamics between interlocutors. This study provides a novel contribution to the literature on head gesture timing and has implications for studies of discourse and accommodation.
    Current systems that use gestures to enable storytelling tend to rely mostly on a pre-scripted set of gestures or on manipulative gestures with respect to tangibles. Our research aims to inform the design of gesture recognition systems for storytelling with implications derived from a feature-based analysis of iconic gestures that occur during naturalistic oral storytelling. We collected story retellings of a collection of cartoon stimuli from 20 study participants, and a gesture analysis was performed on videos of the story retellings, focusing on iconic gestures. Iconic gestures are a type of representational gesture that provides information about objects such as their shape, location, or movement. The form features of the iconic gestures were analyzed with respect to the concepts that they portrayed. Patterns between the two were identified and used to create recommendations for patterns in gesture form that a system could be primed to recognize.