The Time-Invariant String Kernel (TISK) model of spoken word recognition (Hannagan, Magnuson & Grainger, 2013; You & Magnuson, 2018) is an interactive activation model with many similarities to TRACE (McClelland & Elman, 1986). However, by replacing most time-specific nodes in TRACE with time-invariant open-diphone nodes, TISK uses orders of magnitude fewer nodes and connections than TRACE. Although TISK performed remarkably similarly to TRACE in simulations reported by Hannagan et al., the original TISK implementation did not include lexical feedback, precluding simulation of top-down effects, and leaving open the possibility that adding feedback to TISK might fundamentally alter its performance. Here, we demonstrate that when lexical feedback is added to TISK, it gains the ability to simulate top-down effects without losing the ability to simulate the fundamental phenomena tested by Hannagan et al. Furthermore, with feedback, TISK demonstrates graceful degradation when noise is added to input, although parameters can be found that also promote (less) graceful degradation without feedback. We review arguments for and against feedback in cognitive architectures, and conclude that feedback provides a computationally efficient basis for robust constraint-based processing.
Feedback in the Time-Invariant String Kernel model of spoken word recognition
The Time-Invariant String Kernel (TISK) model of spoken word recognition (Hannagan et al., 2013) is an interactive activation model like TRACE (McClelland & Elman, 1986). However, it uses orders of magnitude fewer nodes and connections because it replaces TRACE's time-specific duplicates of phoneme and word nodes with time-invariant nodes based on a string kernel representation (essentially a phoneme-by-phoneme matrix, where a word is encoded by all ordered open diphones it contains; e.g., cat has /kæ/, /æt/, and /kt/). Hannagan et al. (2013) showed that TISK behaves similarly to TRACE in the time course of phonological competition and even in word-specific recognition times. The original implementation, however, did not include feedback from words to diphone nodes, precluding simulation of top-down effects. Here, we demonstrate that TISK can easily be adapted to include lexical feedback, affording simulation of top-down effects as well as allowing the model to demonstrate graceful degradation given noisy inputs.
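To make the representation concrete, here is a minimal sketch of open-diphone extraction (hypothetical code; the published TISK implementation of You & Magnuson, 2018, builds time-invariant phoneme and diphone nodes and activation dynamics on top of this representation):

```python
# Minimal sketch of the open-diphone ("string kernel") encoding described
# above: a word is represented by every ordered pair of its phonemes,
# adjacent or not. Hypothetical code, not the published TISK implementation.
from itertools import combinations

def open_diphones(phonemes):
    """Return the set of ordered open diphones in a word."""
    # combinations() preserves order, so (p1, p2) always has p1 before p2.
    return set(combinations(phonemes, 2))

# "cat" as a phoneme list (ASCII stand-ins for IPA /k/, /ae/, /t/)
print(open_diphones(["k", "ae", "t"]))
# {('k', 'ae'), ('ae', 't'), ('k', 't')} -- i.e., /kae/, /aet/, /kt/
```

Because the inventory of possible diphone nodes is fixed (one per ordered phoneme pair) regardless of where a word occurs in time, no time-specific duplicate nodes are needed, which is the source of the node savings described above.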
- PAR ID: 10097512
- Date Published:
- Journal Name: Proceedings of the Cognitive Science Society
- Page Range / eLocation ID: 732-737
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Whether top-down feedback modulates perception has deep implications for cognitive theories. Debate has been vigorous in the domain of spoken word recognition, where competing computational models and agreement on at least one diagnostic experimental paradigm suggest that the debate may eventually be resolvable. Norris and Cutler (2021) revisit arguments against lexical feedback in spoken word recognition models. They also incorrectly claim that recent computational demonstrations that feedback promotes accuracy and speed under noise (Magnuson et al., 2018) were an artifact of the Luce choice rule rather than of noise added to inputs (noise was in fact added directly to inputs). They further claim that feedback cannot improve word recognition because feedback cannot distinguish signal from noise. We have two goals in this paper. First, we correct the record about the simulations of Magnuson et al. (2018). Second, we explain how interactive activation models selectively sharpen signals via joint effects of feedback and lateral inhibition that boost lexically coherent sublexical patterns over noise. We also review a growing body of behavioral and neural results consistent with feedback and inconsistent with autonomous (non-feedback) architectures, and conclude that parsimony supports feedback. We close by discussing the potential for synergy between autonomous and interactive approaches.
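Since the dispute above turns in part on the Luce choice rule, a minimal sketch of the rule may help (illustrative code with made-up activations and a made-up scaling constant k, not parameters from Magnuson et al., 2018):

```python
# Luce choice rule: convert raw activations into response probabilities,
# p_i = exp(k * a_i) / sum_j exp(k * a_j). All values here are made up.
import numpy as np

def luce_choice(activations, k=7.0):
    strengths = np.exp(k * np.asarray(activations, dtype=float))
    return strengths / strengths.sum()

acts = [0.6, 0.3, 0.1]        # e.g., a target word and two competitors
print(luce_choice(acts))      # highest activation -> highest probability
```

Because the rule is a monotonic transformation, it preserves which node is most active; this is why replicating the feedback advantage with raw activations (as reported below) bears directly on the artifact claim.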
- This study focuses on understanding and quantifying the change in phoneme and prosody information encoded in a Self-Supervised Learning (SSL) model that is brought about by an accent identification (AID) fine-tuning task. This problem is addressed through model probing. Specifically, we conduct a systematic layer-wise analysis of the representations of the Transformer layers on a phoneme correlation task and a novel word-level prosody prediction task. We compare the probing performance of the pre-trained and fine-tuned SSL models. Results show that the AID fine-tuning task steers the top two layers to learn richer phoneme and prosody representations. These changes share some similarities with the effects of fine-tuning with an Automatic Speech Recognition task. In addition, we observe strong accent-specific phoneme representations in layer 9. In sum, this study provides insights into SSL features and their interactions with fine-tuning tasks.
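A minimal sketch of the layer-wise probing logic described above (hypothetical shapes and random stand-in data; the study's actual SSL model, probing tasks, and layer count are not reproduced here, and a real analysis would use frame-aligned phoneme labels rather than random ones):

```python
# Layer-wise probing sketch: fit one linear probe per Transformer layer and
# compare accuracies across layers. Data and labels are random stand-ins,
# so accuracy here will sit at chance; real representations would not.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

n_layers, n_frames, dim = 12, 600, 64
rng = np.random.default_rng(0)

reps = rng.normal(size=(n_layers, n_frames, dim))   # [layer, frame, feature]
labels = rng.integers(0, 40, size=n_frames)         # 40 phoneme classes

for layer in range(n_layers):
    probe = LogisticRegression(max_iter=500)
    acc = cross_val_score(probe, reps[layer], labels, cv=3).mean()
    print(f"layer {layer:2d}: phoneme-probe accuracy = {acc:.3f}")
```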
- Like all domains of cognition, language processing is affected by top–down knowledge. Classic evidence for this is missing blatant errors in the signal. In sentence comprehension, one instance is failing to notice word order errors, such as transposed words in the middle of a sentence: “you that read wrong” (Mirault et al., 2018). Our brains seem to fix such errors, since they are incompatible with our grammatical knowledge, but how do our brains do this? Following behavioral work on inner transpositions, we flashed four-word sentences for 300 ms using rapid parallel visual presentation (Snell and Grainger, 2017). We compared magnetoencephalography responses to fully grammatical and reversed sentences (24 human participants: 21 females, 4 males). The left lateral language cortex robustly distinguished grammatical and reversed sentences starting at 213 ms. Thus, the influence of grammatical knowledge began rapidly after visual word form recognition (Tarkiainen et al., 1999). At the earliest stage of this neural “sentence superiority effect,” inner transpositions patterned between grammatical and reversed sentences, showing evidence that the brain initially “noticed” the error. However, 100 ms later, inner transpositions became indistinguishable from grammatical sentences, suggesting that by this point the brain had “fixed” the error. These results show that after a single glance at a sentence, syntax impacts our neural activity almost as quickly as higher-level object recognition is assumed to take place (Cichy et al., 2014). The earliest stage involves detailed comparisons between the bottom–up input and grammatical knowledge, while shortly afterward, top–down knowledge can override an error in the stimulus.
- We follow up on recent work demonstrating clear advantages of lexical-to-sublexical feedback in the TRACE model of spoken word recognition. The prior work compared accuracy and recognition times in TRACE with feedback on or off as progressively more noise was added to inputs. Recognition times were faster with feedback at every level of noise, and there was an accuracy advantage for feedback with noise added to inputs. However, a recent article claims that those results must be an artifact of converting activations to response probabilities, because feedback could only reinforce the “status quo.” That is, the claim is that given noisy inputs, feedback must reinforce all inputs equally, whether driven by signal or noise. We demonstrate that the feedback advantage replicates with raw activations. We also demonstrate that lexical feedback selectively reinforces lexically coherent input patterns (that is, signal over noise) and explain how that behavior emerges naturally in interactive activation.
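The mechanism claimed in this abstract can be illustrated with a toy interactive-activation network (a hypothetical sketch with made-up parameters and a two-word lexicon; it is not the TRACE or TISK implementation): feedback routes support only to sublexical units consistent with an active word, while lateral inhibition suppresses competing words, so lexically coherent units pull away from noise-driven ones.

```python
# Toy interactive-activation sketch: two word nodes over six sublexical
# units. All parameters are made up for illustration; not TRACE or TISK.
import numpy as np

rng = np.random.default_rng(0)

# Each row lists the sublexical units one word contains.
W = np.array([[1, 1, 1, 0, 0, 0],    # word A spans units 0-2
              [0, 0, 0, 1, 1, 1]],   # word B spans units 3-5
             dtype=float)

def run(inp, feedback, steps=50, rate=0.4, td=0.3, inhib=0.4, decay=0.2):
    sub = np.zeros(6)                            # sublexical activations
    lex = np.zeros(2)                            # lexical activations
    for _ in range(steps):
        bottom_up = W @ sub                      # support from sublexical units
        lateral = inhib * (lex.sum() - lex)      # inhibition from other words
        lex = np.clip(lex + rate * (bottom_up - lateral) - decay * lex, 0, 1)
        top_down = td * (W.T @ lex) if feedback else 0.0
        sub = np.clip(sub + rate * (inp + top_down) - decay * sub, 0, 1)
    return sub, lex

# Word A's input pattern, degraded by noise on every unit.
inp = W[0] * 0.35 + rng.normal(0.0, 0.15, size=6)

for fb in (False, True):
    sub, lex = run(inp, feedback=fb)
    gap = sub[:3].mean() - sub[3:].mean()        # coherent minus noise-driven units
    print(f"feedback={fb}: lexical={np.round(lex, 2)}, sublexical gap={gap:.2f}")
```

With feedback on, only the units consistent with the winning word receive top-down support (noise-driven units get none once the competitor is inhibited), so the gap between coherent and noise-driven sublexical activations widens: feedback reinforces signal, not the status quo.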