Title: Transformation of speech sequences in human sensorimotor circuits
After we listen to a series of words, we can silently replay them in our mind. Does this mental replay involve a reactivation of our original perceptual dynamics? We recorded electrocorticographic (ECoG) activity across the lateral cerebral cortex as people heard and then mentally rehearsed spoken sentences. For each region, we tested whether silent rehearsal of sentences involved reactivation of sentence-specific representations established during perception or transformation to a distinct representation. In sensorimotor and premotor cortex, we observed reliable and temporally precise responses to speech; these patterns transformed to distinct sentence-specific representations during mental rehearsal. In contrast, we observed less reliable and less temporally precise responses in prefrontal and temporoparietal cortex; these higher-order representations, which were sensitive to sentence semantics, were shared across perception and rehearsal of the same sentence. The mental rehearsal of natural speech involves the transformation of stimulus-locked speech representations in sensorimotor and premotor cortex, combined with diffuse reactivation of higher-order semantic representations.
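As a rough illustration of the region-wise test described above, the sketch below contrasts within-sentence and across-sentence pattern similarity between perception and rehearsal: if a region reuses its perceptual code during rehearsal, within-sentence correlations should exceed across-sentence ones. The array shapes, variable names, and simulated data are all assumptions for illustration, not the paper's actual pipeline.

```python
# Sketch of the reactivation-versus-transformation contrast. Shapes,
# names, and the simulated data are illustrative assumptions.
import numpy as np

def reactivation_index(perception, rehearsal):
    """perception, rehearsal: (n_sentences, n_features) mean response
    patterns for one region. Positive values suggest a shared
    (reactivated) code; values near zero suggest a transformed one."""
    n = perception.shape[0]
    within = np.mean([np.corrcoef(perception[i], rehearsal[i])[0, 1]
                      for i in range(n)])
    across = np.mean([np.corrcoef(perception[i], rehearsal[j])[0, 1]
                      for i in range(n) for j in range(n) if i != j])
    return within - across

rng = np.random.default_rng(0)
percep = rng.standard_normal((8, 50))                  # 8 sentences x 50 features
rehear = percep + 0.5 * rng.standard_normal((8, 50))   # simulated noisy reactivation
print(reactivation_index(percep, rehear))              # clearly > 0: shared code
```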
Award ID(s): 1949730
PAR ID: 10148244
Journal Name: Proceedings of the National Academy of Sciences
Volume: 117
Issue: 6
ISSN: 0027-8424
Page Range / eLocation ID: 3203 to 3213
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Little is known about how neural representations of natural sounds differ across species. For example, speech and music play a unique role in human hearing, yet it is unclear how auditory representations of speech and music differ between humans and other animals. Using functional ultrasound imaging, we measured responses in ferrets to a set of natural and spectrotemporally matched synthetic sounds previously tested in humans. Ferrets showed similar lower-level frequency and modulation tuning to that observed in humans. But while humans showed substantially larger responses to natural vs. synthetic speech and music in non-primary regions, ferret responses to natural and synthetic sounds were closely matched throughout primary and non-primary auditory cortex, even when tested with ferret vocalizations. This finding reveals that auditory representations in humans and ferrets diverge sharply at late stages of cortical processing, potentially driven by higher-order processing demands in speech and music. 
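The key contrast in this study is between responses to natural sounds and their spectrotemporally matched synthetic counterparts. One simple way to index that contrast per voxel is sketched below; the normalized-difference index and all data shapes are illustrative assumptions, not the study's analysis.

```python
# Illustrative per-voxel index of the natural-vs-synthetic contrast:
# near zero for matched responses (the ferret pattern), positive when
# natural sounds drive larger responses (the human non-primary pattern).
import numpy as np

def naturalness_preference(resp_natural, resp_synthetic, eps=1e-9):
    """resp_*: (n_voxels, n_sounds) mean response magnitudes."""
    nat = resp_natural.mean(axis=1)
    syn = resp_synthetic.mean(axis=1)
    return (nat - syn) / (nat + syn + eps)

rng = np.random.default_rng(1)
nat = rng.gamma(2.0, 1.0, size=(100, 36))        # 100 voxels x 36 sounds
syn = rng.gamma(2.0, 1.0, size=(100, 36))
print(naturalness_preference(nat, syn).mean())   # near 0: matched responses
```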
  2. Like all domains of cognition, language processing is affected by top–down knowledge. Classic evidence for this is our tendency to miss blatant errors in the signal. In sentence comprehension, one instance is failing to notice word order errors, such as transposed words in the middle of a sentence: “you that read wrong” (Mirault et al., 2018). Our brains seem to fix such errors, since they are incompatible with our grammatical knowledge, but how do our brains do this? Following behavioral work on inner transpositions, we flashed four-word sentences for 300 ms using rapid parallel visual presentation (Snell and Grainger, 2017). We compared magnetoencephalography responses to fully grammatical and reversed sentences (24 human participants: 21 females, 4 males). The left lateral language cortex robustly distinguished grammatical and reversed sentences starting at 213 ms. Thus, the influence of grammatical knowledge began rapidly after visual word form recognition (Tarkiainen et al., 1999). At the earliest stage of this neural “sentence superiority effect,” inner transpositions patterned between grammatical and reversed sentences, evidence that the brain initially “noticed” the error. However, 100 ms later, inner transpositions became indistinguishable from grammatical sentences, suggesting that at this point the brain had “fixed” the error. These results show that after a single glance at a sentence, syntax impacts our neural activity almost as quickly as higher-level object recognition is assumed to take place (Cichy et al., 2014). The earliest stage involves detailed comparisons between the bottom–up input and grammatical knowledge, while shortly afterward, top–down knowledge can override an error in the stimulus.
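The central measurement here is the latency at which two conditions first diverge reliably. The sketch below shows one crude way to estimate such an onset from trial-by-trial data, using a consecutive-significance rule as a stand-in for proper cluster-based correction; the shapes, alpha level, and scipy t-test are assumptions, not the authors' method.

```python
# Crude estimate of a divergence onset like the 213 ms figure above.
import numpy as np
from scipy import stats

def divergence_onset(cond_a, cond_b, times, alpha=0.01, min_samples=10):
    """cond_*: (n_trials, n_times) response amplitude in one ROI.
    Returns the first time at which the conditions differ for
    min_samples consecutive samples, or None."""
    p = stats.ttest_ind(cond_a, cond_b, axis=0).pvalue
    run = 0
    for t, sig in enumerate(p < alpha):
        run = run + 1 if sig else 0
        if run >= min_samples:
            return times[t - min_samples + 1]
    return None

rng = np.random.default_rng(2)
times = np.arange(0.0, 0.6, 0.01)        # 10 ms resolution, simulated
a = rng.standard_normal((30, times.size))
b = rng.standard_normal((30, times.size))
b[:, times >= 0.21] += 1.0               # simulated effect from ~210 ms on
print(divergence_onset(a, b, times))     # approximately 0.21
```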
  3. We examined the neural correlates underlying the semantic processing of native- and nonnative-accented sentences, presented in quiet or embedded in multi-talker noise. Implementing a semantic violation paradigm, 36 English monolingual young adults listened to American-accented (native) and Chinese-accented (nonnative) English sentences with or without semantic anomalies, presented in quiet or embedded in multi-talker noise, while EEG was recorded. After hearing each sentence, participants verbally repeated the sentence, which was coded and scored as an offline comprehension accuracy measure. In line with earlier behavioral studies, the negative impact of background noise on sentence repetition accuracy was greater for nonnative-accented than for native-accented sentences. At the neural level, the N400 effect for semantic anomaly was larger for native-accented than for nonnative-accented sentences, and was also larger for sentences presented in quiet than in noise, indicating impaired lexical-semantic access when listening to nonnative-accented speech or sentences embedded in noise. No semantic N400 effect was observed for nonnative-accented sentences presented in noise. Furthermore, the frequency of neural oscillations in the alpha frequency band (an index of online cognitive listening effort) was higher when listening to sentences in noise versus in quiet, but no difference was observed across the accent conditions. Semantic anomalies presented in background noise also elicited higher theta activity, whereas processing nonnative-accented anomalies was associated with decreased theta activity. Taken together, we found that listening to nonnative accents or background noise is associated with processing challenges during online semantic access, leading to decreased comprehension accuracy. However, the underlying cognitive mechanism (e.g., the associated listening effort) might manifest differently across accented speech processing and speech-in-noise processing.
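The N400 effect reported above is conventionally quantified as the mean amplitude difference between anomalous and well-formed sentences in a roughly 300-500 ms post-word window. A minimal sketch under that conventional assumption follows; the window, single-channel framing, and shapes are not taken from the paper.

```python
# Minimal sketch of quantifying an N400 effect: mean amplitude for
# anomalous minus well-formed sentences in a 300-500 ms window.
import numpy as np

def n400_effect(erp_anomalous, erp_control, times, window=(0.3, 0.5)):
    """erp_*: (n_trials, n_times) amplitudes at a centro-parietal
    channel; times in seconds. More negative return values indicate
    a larger N400 effect."""
    mask = (times >= window[0]) & (times <= window[1])
    return erp_anomalous[:, mask].mean() - erp_control[:, mask].mean()
```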
  4. Given vector representations for individual words, many applications require composing them into vector representations of sentences, often using artificial neural networks. Relatively little work has explored the internal structure and properties of such sentence vectors. In this paper, we explore the properties of sentence vectors in the context of automatic summarization. In particular, we show that cosine similarity between sentence vectors and document vectors is strongly correlated with sentence importance, and that vector semantics can identify and correct gaps between the sentences chosen so far and the document. In addition, we identify specific dimensions that are linked to effective summaries. To our knowledge, this is the first time specific dimensions of sentence embeddings have been connected to sentence properties. We also compare the features of different methods of sentence embeddings. Many of these insights apply to uses of sentence embeddings far beyond summarization.
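The core idea, that cosine similarity between sentence vectors and a document vector tracks sentence importance and that vectors can expose gaps left by the sentences chosen so far, lends itself to a compact greedy summarizer. The sketch below is one such reading; the mean-of-sentence-vectors document representation and all names are illustrative assumptions rather than the paper's method.

```python
# Greedy extractive summarizer: repeatedly pick the sentence whose
# vector best closes the gap between the running summary and the
# document vector.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def greedy_summary(sent_vecs, k=3):
    """sent_vecs: (n_sentences, dim) precomputed embeddings, k <= n.
    Returns indices of the selected sentences in document order."""
    doc = sent_vecs.mean(axis=0)                 # simple document vector
    chosen, summary = [], np.zeros_like(doc)
    for _ in range(k):
        gap = doc - summary / max(len(chosen), 1)   # what is still missing
        best = max((i for i in range(len(sent_vecs)) if i not in chosen),
                   key=lambda i: cosine(sent_vecs[i], gap))
        chosen.append(best)
        summary += sent_vecs[best]
    return sorted(chosen)

rng = np.random.default_rng(3)
print(greedy_summary(rng.standard_normal((12, 64)), k=3))
```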
  5. Aging is associated with an exaggerated representation of the speech envelope in auditory cortex. The relationship between this age-related exaggerated response and a listener’s ability to understand speech in noise remains an open question. Here, information-theory-based analysis methods are applied to magnetoencephalography recordings of human listeners, investigating their cortical responses to continuous speech, using the novel nonlinear measure of phase-locked mutual information between the speech stimuli and cortical responses. The cortex of older listeners shows an exaggerated level of mutual information, compared with younger listeners, for both attended and unattended speakers. The mutual information peaks for several distinct latencies: early (∼50 ms), middle (∼100 ms), and late (∼200 ms). For the late component, the neural enhancement of attended over unattended speech is affected by stimulus signal-to-noise ratio, but the direction of this dependency is reversed by aging. Critically, in older listeners and for the same late component, greater cortical exaggeration is correlated with decreased behavioral inhibitory control. This negative correlation also carries over to speech intelligibility in noise, where greater cortical exaggeration in older listeners is correlated with worse speech intelligibility scores. Finally, an age-related lateralization difference is also seen for the ∼100 ms latency peaks, where older listeners show a bilateral response compared with younger listeners’ right lateralization. Thus, this information-theory-based analysis provides new, and less coarse-grained, results regarding age-related change in auditory cortical speech processing, and its correlation with cognitive measures, compared with related linear measures. NEW & NOTEWORTHY Cortical representations of natural speech are investigated using a novel nonlinear approach based on mutual information. Cortical responses, phase-locked to the speech envelope, show an exaggerated level of mutual information associated with aging, appearing at several distinct latencies (∼50, ∼100, and ∼200 ms). Critically, for older listeners only, the ∼200 ms latency response components are correlated with specific behavioral measures, including behavioral inhibition and speech comprehension.
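The analysis above hinges on mutual information between the speech envelope and the cortical response at specific latencies. The sketch below uses a plain histogram MI estimator at fixed lags; the paper's phase-locked estimator is more sophisticated, so the estimator, bin count, sampling rate, and simulated data here are all illustrative assumptions.

```python
# Lagged mutual-information analysis in the spirit of the study above,
# with a simple histogram estimator.
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram-based MI (bits) between two 1-D signals."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz]))

def mi_by_latency(envelope, response, fs, lags_ms=(50, 100, 200)):
    """MI between speech envelope and cortical response at the early
    (~50 ms), middle (~100 ms), and late (~200 ms) latencies."""
    return {ms: mutual_info(envelope[:-int(fs * ms / 1000)],
                            response[int(fs * ms / 1000):])
            for ms in lags_ms}

fs = 200                                  # Hz, assumed sampling rate
rng = np.random.default_rng(4)
env = rng.standard_normal(fs * 60)        # 60 s of simulated envelope
resp = np.roll(env, int(0.2 * fs)) + rng.standard_normal(env.size)
print(mi_by_latency(env, resp, fs))       # MI peaks at the 200 ms lag
```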