Title: Performance on stochastic figure-ground perception varies with individual differences in speech-in-noise recognition and working memory capacity
Speech recognition in noisy environments can be challenging and requires listeners to accurately segregate a target speaker from irrelevant background noise. Stochastic figure-ground (SFG) tasks, in which temporally coherent inharmonic pure tones must be identified from a background, have been used to probe the non-linguistic auditory stream segregation processes important for speech-in-noise processing. However, little is known about the relationship between performance on SFG tasks and speech-in-noise tasks, or the individual differences that may modulate such relationships. In this study, 37 younger normal-hearing adults performed an SFG task with target figure chords consisting of four, six, eight, or ten temporally coherent tones amongst a background of randomly varying tones. Stimuli were designed to be spectrally and temporally flat. An increased number of temporally coherent tones resulted in higher accuracy and faster reaction times (RTs). For ten target tones, faster RTs were associated with better scores on the Quick Speech-in-Noise task. Individual differences in working memory capacity and self-reported musicianship further modulated these relationships. Overall, results demonstrate that the SFG task could serve as an assessment of auditory stream segregation accuracy and RT that is sensitive to individual differences in cognitive and auditory abilities, even among younger normal-hearing adults.
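The SFG stimulus described above can be sketched as a sequence of short tone chunks: the background frequencies are redrawn on every chunk, while the "figure" frequencies repeat, making them temporally coherent. This is a minimal illustration, not the study's stimulus code; the frequency pool, chunk duration, and tone counts are assumptions.

```python
import numpy as np

def sfg_chunk(figure_freqs=None, n_background=20, fs=16000, dur=0.05, rng=None):
    """One 50-ms chunk: freshly drawn random background tones, plus an
    optional 'figure' chord whose frequencies repeat across chunks."""
    rng = np.random.default_rng() if rng is None else rng
    pool = np.geomspace(200.0, 7200.0, 128)       # hypothetical log-spaced pool
    t = np.arange(int(fs * dur)) / fs
    freqs = list(rng.choice(pool, size=n_background, replace=False))
    if figure_freqs is not None:
        freqs += list(figure_freqs)
    return sum(np.sin(2 * np.pi * f * t) for f in freqs) / len(freqs)

rng = np.random.default_rng(0)
pool = np.geomspace(200.0, 7200.0, 128)
figure = rng.choice(pool, size=8, replace=False)   # 8 temporally coherent tones
# Figure-present trial: the same 8 tones persist while the background re-draws
stim = np.concatenate([sfg_chunk(figure, rng=rng) for _ in range(8)])
```

A figure-absent trial would simply call `sfg_chunk()` without `figure_freqs`, so only the repeated chord distinguishes the two conditions.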
Award ID(s):
2020624
PAR ID:
10484682
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Publisher / Repository:
The Journal of the Acoustical Society of America
Date Published:
Journal Name:
The Journal of the Acoustical Society of America
Volume:
153
Issue:
1
ISSN:
0001-4966
Page Range / eLocation ID:
286 to 303
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Numerous studies have suggested that the perception of a target sound stream (or source) can only be segregated from a complex acoustic background mixture if the acoustic features underlying its perceptual attributes (e.g., pitch, location, and timbre) induce temporally modulated responses that are mutually correlated (or coherent), and that are uncorrelated (incoherent) with those of other sources in the mixture. This "temporal coherence" hypothesis asserts that attentive listening to one acoustic feature of a target enhances brain responses to that feature but would also concomitantly (1) induce mutually excitatory influences with other coherently responding neurons, thus enhancing (or binding) them all as they respond to the attended source; by contrast, (2) suppressive interactions are hypothesized to build up among neurons driven by temporally incoherent sound features, thus relatively reducing their activity. In this study, we report on EEG measurements in human subjects engaged in various sound segregation tasks that demonstrate rapid binding among the temporally coherent features of the attended source regardless of their identity (pure tone components, tone complexes, or noise), harmonic relationship, or frequency separation, thus confirming the key role temporal coherence plays in the analysis and organization of auditory scenes.
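The temporal coherence idea above can be illustrated numerically: feature channels driven by the same source share a temporal modulation and so correlate strongly, while channels driven by different sources do not. This toy sketch (envelope sampling rate, noise levels, and channel labels are all assumptions) uses zero-lag correlation as the coherence measure.

```python
import numpy as np

# Hypothetical feature-channel envelopes sampled at 100 Hz for 2 s
fs, dur = 100, 2.0
n = int(fs * dur)
rng = np.random.default_rng(2)

shared = rng.normal(size=n)                 # common temporal modulation (one source)
chan_a = shared + 0.3 * rng.normal(size=n)  # e.g. a pitch channel of that source
chan_b = shared + 0.3 * rng.normal(size=n)  # e.g. a timbre channel of the same source
chan_c = rng.normal(size=n)                 # a channel driven by a different source

def coherence(x, y):
    """Zero-lag Pearson correlation between two channel envelopes."""
    return float(np.corrcoef(x, y)[0, 1])

coh_same = coherence(chan_a, chan_b)   # high: same underlying source
coh_diff = coherence(chan_a, chan_c)   # near zero: independent sources
```

On this view, binding amounts to grouping channels whose pairwise coherence is high, and segregating those whose coherence is near zero.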
  2. Listening to speech in noise can require substantial mental effort, even among younger normal-hearing adults. The task-evoked pupil response (TEPR) has been shown to track the increased effort exerted to recognize words or sentences in increasing noise. However, few studies have examined the trajectory of listening effort across longer, more natural, stretches of speech, or the extent to which expectations about upcoming listening difficulty modulate the TEPR. Seventeen younger normal-hearing adults listened to 60-s-long audiobook passages, repeated three times in a row, at two different signal-to-noise ratios (SNRs) while pupil size was recorded. There was a significant interaction between SNR, repetition, and baseline pupil size on sustained listening effort. At lower baseline pupil sizes, potentially reflecting lower attention mobilization, TEPRs were more sustained in the harder SNR condition, particularly when attention mobilization remained low by the third presentation. At intermediate baseline pupil sizes, differences between conditions were largely absent, suggesting these listeners had optimally mobilized their attention for both SNRs. Lastly, at higher baseline pupil sizes, potentially reflecting over-mobilization of attention, the effect of SNR was initially reversed for the second and third presentations: participants initially appeared to disengage in the harder SNR condition, resulting in reduced TEPRs that recovered in the second half of the story. Together, these findings suggest that the unfolding of listening effort over time depends critically on the extent to which individuals have successfully mobilized their attention in anticipation of difficult listening conditions. 
  3. Speech sounds exist in a complex acoustic–phonetic space, and listeners vary in the extent to which they are sensitive to variability within the speech sound category ("gradience") and the degree to which they show stable, consistent responses to phonetic stimuli. Here, we investigate the hypothesis that individual differences in the perception of the sound categories of one's language may aid speech-in-noise performance across the adult lifespan. Declines in speech-in-noise performance are well documented in healthy aging, and are, unsurprisingly, associated with differences in hearing ability. Nonetheless, hearing status and age are incomplete predictors of speech-in-noise performance, and long-standing research suggests that this ability draws on more complex cognitive and perceptual factors. In this study, a group of adults ranging in age from 18 to 67 years performed online assessments designed to measure phonetic category sensitivity, completed questionnaires querying recent noise exposure history and demographic factors, and, crucially, took a test of speech-in-noise perception. Results show that individual differences in the perception of two consonant contrasts significantly predict speech-in-noise performance, even after accounting for age and recent noise exposure history. This finding supports the hypothesis that individual differences in sensitivity to phonetic categories mediate speech perception in challenging listening situations.
  4. Introduction: Using data collected from hearing aid users' own hearing aids could improve the customization of hearing aid processing for different users based on the auditory environments they encounter in daily life. Prior studies characterizing hearing aid users' auditory environments have focused on mean sound pressure levels and proportions of environments based on classifications. In this study, we extend these approaches by introducing entropy to quantify the diversity of auditory environments hearing aid users encounter.
Materials and Methods: Participants from 4 groups (younger listeners with normal hearing and older listeners with hearing loss, from an urban or rural area) wore research hearing aids and completed ecological momentary assessments on a smartphone for 1 week. The smartphone was programmed to sample the processing state (input sound pressure level and environment classification) of the hearing aids every 10 min and deliver an ecological momentary assessment every 40 min. Entropy values for sound pressure levels, environment classifications, and ecological momentary assessment responses were calculated for each participant to quantify the diversity of auditory environments encountered over the course of the week. Entropy values between groups were compared. Group differences in entropy were compared to prior work reporting differences in mean sound pressure levels and proportions of environment classifications. Group differences in entropy measured objectively from the hearing aid data were also compared to differences in entropy measured from the self-report ecological momentary assessment data.
Results: Auditory environment diversity, quantified using entropy from the hearing aid data, was significantly higher for younger listeners than older listeners. Entropy measured using ecological momentary assessment was also significantly higher for younger listeners than older listeners.
Discussion: Using entropy, we show that younger listeners experience a greater diversity of auditory environments than older listeners. Alignment of group entropy differences with differences in sound pressure levels and hearing aid feature activation previously reported, along with alignment with ecological momentary response entropy, suggests that entropy is a valid and useful metric. We conclude that entropy is a simple and intuitive way to measure auditory environment diversity using hearing aid data.
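The entropy metric described above can be sketched as Shannon entropy over a week's sampled environment classifications. The class names and sampling counts here are illustrative, not the study's actual categories:

```python
import numpy as np

def shannon_entropy(labels):
    """Shannon entropy (bits) of a sequence of categorical observations,
    e.g. hearing-aid environment classifications sampled every 10 min."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

uniform = ["speech", "noise", "music", "quiet"] * 25   # diverse week
skewed = ["quiet"] * 95 + ["speech"] * 5               # monotonous week

h_uniform = shannon_entropy(uniform)   # 2.0 bits, the maximum for 4 classes
h_skewed = shannon_entropy(skewed)     # ~0.29 bits, far less diverse
```

Higher entropy means the listener's sampled environments were spread more evenly across classes, which is what the younger group showed.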
  5. Spectrotemporal modulations (STM) are essential features of speech signals that make them intelligible. While their encoding has been widely investigated in neurophysiology, we still lack a full understanding of how STMs are processed at the behavioral level and how cochlear hearing loss impacts this processing. Here, we introduce a novel methodological framework based on psychophysical reverse correlation deployed in the modulation space to characterize the mechanisms underlying STM detection in noise. We derive perceptual filters for young normal-hearing and older hearing-impaired individuals performing a detection task of an elementary target STM (a given product of temporal and spectral modulations) embedded in other masking STMs. Analyzed with computational tools, our data show that both groups rely on a comparable linear (band-pass)–nonlinear processing cascade, which can be well accounted for by a temporal modulation filter bank model combined with cross-correlation against the target representation. Our results also suggest that the modulation mistuning observed for the hearing-impaired group results primarily from broader cochlear filters. Yet, we find idiosyncratic behaviors that cannot be captured by cochlear tuning alone, highlighting the need to consider variability originating from additional mechanisms. Overall, this integrated experimental-computational approach offers a principled way to assess suprathreshold processing distortions in each individual and could thus be used to further investigate interindividual differences in speech intelligibility. 
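The reverse-correlation logic above can be sketched with a toy linear-template observer: a perceptual filter is estimated as the difference between the average masking pattern on "yes" trials and on "no" trials (a classification image). The bin count, template shape, and observer model are assumptions, not the paper's actual analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_bins = 5000, 64        # hypothetical discretization of the modulation space

target = np.zeros(n_bins)
target[30:34] = 1.0                # toy "target STM" template
masks = rng.normal(size=(n_trials, n_bins))   # random masking modulations per trial

# Simulated linear-template observer with internal noise -> yes/no responses
resp = (masks @ target + rng.normal(size=n_trials)) > 0.0

# Classification image: mean mask on "yes" trials minus mean on "no" trials
ci = masks[resp].mean(axis=0) - masks[~resp].mean(axis=0)
```

The recovered `ci` peaks at the bins carrying the template's weight, which is how a perceptual filter can be read out from trial-by-trial noise and responses.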