Title: Towards real-world objective speech quality and intelligibility assessment using speech-enhancement residuals and convolutional long short-term memory networks
Award ID(s):
1755844
NSF-PAR ID:
10287702
Author(s) / Creator(s):
;
Date Published:
Journal Name:
The Journal of the Acoustical Society of America
Volume:
148
Issue:
5
ISSN:
0001-4966
Page Range / eLocation ID:
3348 to 3359
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. When listening to speech, our brain responses time lock to acoustic events in the stimulus. Recent studies have also reported that cortical responses track linguistic representations of speech. However, tracking of these representations is often described without controlling for acoustic properties. Therefore, the response to these linguistic representations might reflect unaccounted acoustic processing rather than language processing. Here, we evaluated the potential of several recently proposed linguistic representations as neural markers of speech comprehension. To do so, we investigated EEG responses to audiobook speech of 29 participants (22 females). We examined whether these representations contribute unique information over and beyond acoustic neural tracking and each other. Indeed, not all of these linguistic representations were significantly tracked after controlling for acoustic properties. However, phoneme surprisal, cohort entropy, word surprisal, and word frequency were all significantly tracked over and beyond acoustic properties. We also tested the generality of the associated responses by training on one story and testing on another. In general, the linguistic representations are tracked similarly across different stories spoken by different readers. These results suggest that these representations characterize the processing of the linguistic content of speech. SIGNIFICANCE STATEMENT For clinical applications, it would be desirable to develop a neural marker of speech comprehension derived from neural responses to continuous speech. Such a measure would allow for behavior-free evaluation of speech understanding; this would open doors toward better quantification of speech understanding in populations from whom obtaining behavioral measures may be difficult, such as young children or people with cognitive impairments, to allow better targeted interventions and better fitting of hearing devices.
  2. The effect of training on linguistic release from masking (LRM) was examined. In a pre-test and post-test, English monolingual listeners transcribed sentences presented with English and Dutch maskers. During training, participants transcribed sentences with Dutch, English, or white-noise maskers and received feedback. LRM was evident in the pre-test (performance was better with Dutch maskers) but was eliminated after training (masker conditions did not differ). Thus, the informational masking driving LRM can be ameliorated through training. This study is a basis for future research examining the specific aspects of informational masking that change as a function of experience.