skip to main content

Title: Bouncing the network: A dynamical systems model of auditory–vestibular interactions underlying infants’ perception of musical rhythm

Previous work suggests that auditory–vestibular interactions, which emerge during bodily movement to music, can influence the perception of musical rhythm. In a seminal study on the ontogeny of musical rhythm, Phillips‐Silver and Trainor (2005) found that bouncing infants to an unaccented rhythm influenced infants’ perceptual preferences for accented rhythms that matched the rate of bouncing. In the current study, we ask whether nascent, diffuse coupling between auditory and motor systems is sufficient to bootstrap short‐term Hebbian plasticity in the auditory system and explain infants’ preferences for accented rhythms thought to arise from auditory–vestibular interactions. First, we specify a nonlinear, dynamical system in which two oscillatory neural networks, representing developmentally nascent auditory and motor systems, interact through weak, non‐specific coupling. The auditory network was equipped with short‐term Hebbian plasticity, allowing the auditory network to tune its intrinsic resonant properties. Next, we simulate the effect of vestibular input (e.g., infant bouncing) on infants’ perceptual preferences for accented rhythms. We found that simultaneous auditory–vestibular training shaped the model's response to musical rhythm, enhancing vestibular‐related frequencies in auditory‐network activity. Moreover, simultaneous auditory–vestibular training, relative to auditory‐ or vestibular‐only training, facilitated short‐term auditory plasticity in the model, producing stronger oscillator connections in the auditory network. Finally, when tested on a musical rhythm, models which received simultaneous auditory–vestibular training, but not models that received auditory‐ or vestibular‐only training, resonated strongly at frequencies related to their “bouncing,” a finding qualitatively similar to infants’ preferences for accented rhythms that matched the rate of infant bouncing.

more » « less
Award ID(s):
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Date Published:
Journal Name:
Developmental Science
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Rhythm plays an important role in language perception and learning, with infants perceiving rhythmic differences across languages at birth. While the mechanisms underlying rhythm perception in speech remain unclear, one interesting possibility is that these mechanisms are similar to those involved in the perception of musical rhythm. In this work, we adopt a model originally designed for musical rhythm to simulate speech rhythm perception. We show that this model replicates the behavioral results of language discrimination in newborns, and outperforms an existing model of infant language discrimination. We also find that percussives — fast-changing components in the acoustics — are necessary for distinguishing languages of different rhythms, which suggests that percussives are essential for rhythm perception. Our music-inspired model of speech rhythm may be seen as a first step towards a unified theory of how rhythm is represented in speech and music. 
    more » « less
  2. Neural entrainment to musical rhythm is thought to underlie the perception and production of music. In aging populations, the strength of neural entrainment to rhythm has been found to be attenuated, particularly during attentive listening to auditory streams. However, previous studies on neural entrainment to rhythm and aging have often employed artificial auditory rhythms or limited pieces of recorded, naturalistic music, failing to account for the diversity of rhythmic structures found in natural music. As part of larger project assessing a novel music-based intervention for healthy aging, we investigated neural entrainment to musical rhythms in the electroencephalogram (EEG) while participants listened to self-selected musical recordings across a sample of younger and older adults. We specifically measured neural entrainment to the level of musical pulse—quantified here as the phase-locking value (PLV)—after normalizing the PLVs to each musical recording’s detected pulse frequency. As predicted, we observed strong neural phase-locking to musical pulse, and to the sub-harmonic and harmonic levels of musical meter. Overall, PLVs were not significantly different between older and younger adults. This preserved neural entrainment to musical pulse and rhythm could support the design of music-based interventions that aim to modulate endogenous brain activity via self-selected music for healthy cognitive aging. 
    more » « less
  3. The extent that articulatory information embedded in incoming speech contributes to the formation of new perceptual categories for speech sounds has been a matter of discourse for decades. It has been theorized that the acquisition of new speech sound categories requires a network of sensory and speech motor cortical areas (the “dorsal stream”) to successfully integrate auditory and articulatory information. However, it is possible that these brain regions are not sensitive specifically to articulatory information, but instead are sensitive to the abstract phonological categories being learned. We tested this hypothesis by training participants over the course of several days on an articulable non-native speech contrast and acoustically matched inarticulable nonspeech analogues. After reaching comparable levels of proficiency with the two sets of stimuli, activation was measured in fMRI as participants passively listened to both sound types. Decoding of category membership for the articulable speech contrast alone revealed a series of left and right hemisphere regions outside of the dorsal stream that have previously been implicated in the emergence of non-native speech sound categories, while no regions could successfully decode the inarticulable nonspeech contrast. Although activation patterns in the left inferior frontal gyrus (IFG), the middle temporal gyrus (MTG), and the supplementary motor area (SMA) provided better information for decoding articulable (speech) sounds compared to the inarticulable (sine wave) sounds, the finding that dorsal stream regions do not emerge as good decoders of the articulable contrast alone suggests that other factors, including the strength and structure of the emerging speech categories are more likely drivers of dorsal stream activation for novel sound learning. 
    more » « less
  4. Infant-robot interaction has been increasingly gaining attention, yet, there are limited studies on the development of robot-assisted environments that promote perceptual-motor development in infants. This paper assesses the feasibility of operating a spherical mobile robot, Sphero, to engage infants in perceptual-motor exploration of an open area. Two case scenarios were considered. In the first case, Sphero was the only robot providing stimuli in the environment. In the second case, two additional robots provided stimuli along with Sphero. Pilot data from two infants were analyzed to extract information on their visual attention to and physical interaction with Sphero, as well as their motor actions. Overall, infants (i) expressed a preference to Sphero regardless of stimulation levels, and (ii) moved out of stationary postures in an effort to chase and approach Sphero. These preliminary findings provide support for the future implementation of Sphero in robot-assisted learning environments to promote perceptual-motor development in infants. 
    more » « less
  5. Prosody perception is fundamental to spoken language communication as it supports comprehension, pragmatics, morphosyntactic parsing of speech streams, and phonological awareness. A particular aspect of prosody: perceptual sensitivity to speech rhythm patterns in words (i.e., lexical stress sensitivity), is also a robust predictor of reading skills, though it has received much less attention than phonological awareness in the literature. Given the importance of prosody and reading in educational outcomes, reliable and valid tools are needed to conduct large-scale health and genetic investigations of individual differences in prosody, as groundwork for investigating the biological underpinnings of the relationship between prosody and reading. Motivated by this need, we present the Test of Prosody via Syllable Emphasis (“TOPsy”) and highlight its merits as a phenotyping tool to measure lexical stress sensitivity in as little as 10 min, in scalable internet-based cohorts. In this 28-item speech rhythm perception test [modeled after the stress identification test from Wade-Woolley (2016) ], participants listen to multi-syllabic spoken words and are asked to identify lexical stress patterns. Psychometric analyses in a large internet-based sample shows excellent reliability, and predictive validity for self-reported difficulties with speech-language, reading, and musical beat synchronization. Further, items loaded onto two distinct factors corresponding to initially stressed vs. non-initially stressed words. These results are consistent with previous reports that speech rhythm perception abilities correlate with musical rhythm sensitivity and speech-language/reading skills, and are implicated in reading disorders (e.g., dyslexia). We conclude that TOPsy can serve as a useful tool for studying prosodic perception at large scales in a variety of different settings, and importantly can act as a validated brief phenotype for future investigations of the genetic architecture of prosodic perception, and its relationship to educational outcomes. 
    more » « less