
Title: Delayed Auditory Feedback Elicits Specific Patterns of Serial Order Errors in a Paced Syllable Sequence Production Task
Purpose: Delayed auditory feedback (DAF) interferes with speech output. DAF causes distorted and disfluent productions and errors in the serial order of produced sounds. Although DAF has been studied extensively, the specific patterns of elicited speech errors are somewhat obscured by relatively small speech samples, differences across studies, and uncontrolled variables. The goal of this study was to characterize the types of serial order errors that increase under DAF in a systematic syllable sequence production task, which used a closed set of sounds and controlled for speech rate. Method: Sixteen adult speakers repeatedly produced CVCVCV (C = consonant, V = vowel) sequences, paced to a “visual metronome,” while hearing self-generated feedback with delays of 0–250 ms. Listeners transcribed recordings, and speech errors were classified based on the literature surrounding naturally occurring slips of the tongue. A series of mixed-effects models were used to assess the effects of delay for different error types, for error arrival time, and for speaking rate. Results: DAF had a significant effect on the overall error rate for delays of 100 ms or greater. Statistical models revealed significant effects (relative to zero delay) for vowel and syllable repetitions, vowel exchanges, vowel omissions, onset disfluencies, and distortions. Serial order errors were especially dominated by vowel and syllable repetitions. Errors occurred earlier on average within a trial for longer feedback delays. Although longer delays caused slower speech, this effect was mediated by the run number (time in the experiment) and small compared with those in previous studies. Conclusions: DAF drives a specific pattern of serial order errors. The dominant pattern of vowel and syllable repetition errors suggests possible mechanisms whereby DAF drives changes to the activity in speech planning representations, yielding errors. These mechanisms are outlined with reference to the GODIVA (Gradient Order Directions Into Velocities of Articulators) model of speech planning and production. Supplemental Material: https://doi.org/10.23641/asha.19601785
Authors:
; ; ; ;
Award ID(s):
2029245
Publication Date:
NSF-PAR ID:
10338801
Journal Name:
Journal of Speech, Language, and Hearing Research
Volume:
65
Issue:
5
Page Range or eLocation-ID:
1800 to 1821
ISSN:
1092-4388
Sponsoring Org:
National Science Foundation
More Like this
  1. The way listeners perceive speech sounds is largely determined by the language(s) they were exposed to as a child. For example, native speakers of Japanese have a hard time discriminating between American English /ɹ/ and /l/, a phonetic contrast that has no equivalent in Japanese. Such effects are typically attributed to knowledge of sounds in the native language, but quantitative models of how these effects arise from linguistic knowledge are lacking. One possible source for such models is Automatic Speech Recognition (ASR) technology. We implement models based on two types of systems from the ASR literature—hidden Markov models (HMMs) and the more recent, and more accurate, neural network systems—and ask whether, in addition to showing better performance, the neural network systems also provide better models of human perception. We find that while both types of systems can account for Japanese natives’ difficulty with American English /ɹ/ and /l/, only the neural network system successfully accounts for Japanese natives’ facility with Japanese vowel length contrasts. Our work provides a new example, in the domain of speech perception, of an often observed correlation between task performance and similarity to human behavior.
  2. Bilinguals occasionally produce language intrusion errors (inadvertent translations of the intended word), especially when attempting to produce function word targets, and often when reading aloud mixed-language paragraphs. We investigate whether these errors are due to a failure of attention during speech planning, or failure of monitoring speech output by classifying errors based on whether and when they were corrected, and investigating eye movement behaviour surrounding them. Prior research on this topic has primarily tested alphabetic languages (e.g., Spanish–English bilinguals) in which part of speech is confounded with word length, which is related to word skipping (i.e., decreased attention). Therefore, we tested 29 Chinese–English bilinguals whose languages differ in orthography, visually cueing language membership, and for whom part of speech (in Chinese) is less confounded with word length. Despite the strong orthographic cue, Chinese–English bilinguals produced intrusion errors with similar effects as previously reported (e.g., especially with function word targets written in the dominant language). Gaze durations did differ by whether errors were made and corrected or not, but these patterns were similar for function and content words and therefore cannot explain part of speech effects. However, bilinguals regressed to words produced as errors more often than to correctly produced words, but regressions facilitated correction of errors only for content, not for function words. These data suggest that the vulnerability of function words to language intrusion errors primarily reflects automatic retrieval and failures of speech monitoring mechanisms from stopping function versus content word errors after they are planned for production.
  3. Acoustic analysis of typically developing elementary school-aged (prepubertal) children’s speech has been primarily performed on cross-sectional data in the past. Few studies have examined longitudinal data in this age group. For this presentation, we analyze the developmental changes in the acoustic properties of children’s speech using data collected longitudinally over four years (from first grade to fourth grade). Four male and four female children participated in this study. Data were collected once every year for each child. Using these data, we measured the four-year development of subglottal acoustics (first two subglottal resonances) and vowel acoustics (first four formants and fundamental frequency). Subglottal acoustic measurements are relatively independent of context, and average values were obtained for each child in each year. Vowel acoustics measurements were made for seven vowels (i, ɪ, ɛ, æ, ʌ, ɑ, u), each occurring in two different words in the stressed syllable. We investigated the correlations between the children’s subglottal acoustics, vowel acoustics, and growth-related variables such as standing height, sitting height, and chronological age. Gender-, vowel-, and child-specific analyses were carried out in order to shed light on how typically developing speech acoustics depend on such variables. [Work supported, in part, by the NSF.]
  4. Motor behavior often occurs in environments with multiple goal options that can vary during the ongoing action. We explored this situation by requiring subjects to select between different target options during an ongoing reach. During split trials the original target was replaced with a left and a right flanking target, and participants had to select between them. This contrasted with the standard jump trials, where the original target would be replaced with a single flanking target, left or right. When participants were instructed to follow their natural tendency, they all tended to select the split target nearest the original. The near-target preference was more prominent with increased spatial disparity between the options and when participants could preview the potential options. Moreover, explicit instruction to obtain the “far” target during split trials resulted in many errors compared with a “near” instruction, ~50% vs. ~15%. Online reaction times to target change were delayed in split trials compared with jump trials, ~200 ms vs. ~150 ms, but also highly automatic. Trials in which the instructed far target was correctly obtained were delayed by a further ~50 ms, unlike those in which the near target was incorrectly obtained. We also observed nonspecific responses from arm muscles at the jump trial latency during split trials. Taken together, our results indicate that online selection of reach targets is automatically linked to the spatial distribution of the options, though at greater delays than redirecting to a single target. NEW & NOTEWORTHY This work demonstrates that target selection during an ongoing reach is automatically linked to the option nearest a voided target. Online reaction times for two options are longer than redirection to a single option. Attempts to override the near-target tendency result in a high number of errors at the normal delay and further delays when the attempt is successful.
  5. ASME (Ed.)
    Research was conducted to determine combustion characteristics such as ignition delay (ID), combustion delay (CD), combustion phasing (CA 50), combustion duration, derived cetane number (DCN), and ringing intensity (RI) of F24, to assess its compatibility with a Common Rail Direct Injection (CRDI) compression ignition (CI) engine. The first part of this study investigates the performance of Jet-A, F24, and ultra-low sulfur diesel #2 (ULSD) using a constant volume combustion chamber (CVCC), followed by experiments in a fired CRDI research engine. Investigations of the spray atomization and droplet size distribution of the neat fuels were conducted with a Malvern Mie scattering He-Ne laser. It was found that the average Sauter Mean Diameter (SMD) for Jet-A and F24 is similar, with both fuels' SMD ranging between 25 and 29 micrometers. Meanwhile, ULSD was found to have a larger SMD, in the range of 34–40 micrometers. It was observed during the study, utilizing the CVCC, that the ID and CD for neat ULSD and Jet-A are nearly identical while the combustion of F24 is delayed. F24 was found to have longer durations of both ID and CD by approx. 0.5 ms. This results in a lower DCN for the fuel of 43.5, whereas ULSD and Jet-A have DCNs of 45 and 47, respectively. The peak AHRR values for ULSD and Jet-A are nearly identical, whereas F24 has a peak magnitude approx. 20% lower than ULSD and Jet-A. It was found that both aviation fuels had significantly fewer ringing events occurring after peak high temperature heat release (HTHR), a trend also observed in the CRDI research engine.
    Neat F24, Jet-A, and ULSD were researched in the experimental engine at the same thermodynamic parameters: 5 bar indicated mean effective pressure (IMEP), 50°C (supercharged and EGR) inlet air temperature, 1500 RPM, start of injection (SOI) 16°BTDC, and 800 bar of fuel rail injection pressure as the baseline parameters in order to observe their ignition behavior, low temperature heat release, combustion phasing, and combustion duration. It was found that the ignition delay of F24 and Jet-A was greater than that of ULSD, by approx. 5% for both aviation fuels. This ignition delay also affected the combustion phasing, or CA 50, of the aviation fuels. The CA 50 of the aviation fuels was delayed by approx. 2% compared to ULSD. Jet-A had a nearly identical combustion duration compared to ULSD; however, F24 had an extended combustion duration, approx. 3% longer than that of ULSD and Jet-A. It was found that, with the accumulation of these delays in ID, CD, and CA 50, the RI of the aviation fuels was reduced. F24 was found to have greater delays, and its RI correlates with these results, showing a 70% reduction in RI compared to ULSD.