NSF PAR Search | NSF Public Access Repository


Lexical Feedback in the Time-Invariant String Kernel (TISK) Model of Spoken Word Recognition

https://doi.org/10.5334/joc.362

Magnuson, James S; You, Heejo; Hannagan, Thomas (January 2024, Journal of Cognition)

The Time-Invariant String Kernel (TISK) model of spoken word recognition (Hannagan, Magnuson & Grainger, 2013; You & Magnuson, 2018) is an interactive activation model with many similarities to TRACE (McClelland & Elman, 1986). However, by replacing most time-specific nodes in TRACE with time-invariant open-diphone nodes, TISK uses orders of magnitude fewer nodes and connections than TRACE. Although TISK performed remarkably similarly to TRACE in simulations reported by Hannagan et al., the original TISK implementation did not include lexical feedback, precluding simulation of top-down effects, and leaving open the possibility that adding feedback to TISK might fundamentally alter its performance. Here, we demonstrate that when lexical feedback is added to TISK, it gains the ability to simulate top-down effects without losing the ability to simulate the fundamental phenomena tested by Hannagan et al. Furthermore, with feedback, TISK demonstrates graceful degradation when noise is added to input, although parameters can be found that also promote (less) graceful degradation without feedback. We review arguments for and against feedback in cognitive architectures, and conclude that feedback provides a computationally efficient basis for robust constraint-based processing.
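As a rough illustration of the open-diphone idea mentioned in the abstract above, the sketch below enumerates ordered phoneme pairs for a word. It assumes that an open diphone is any ordered pair of phonemes occurring in a word, whether adjacent or not; the abstract does not spell out this definition, and the function name `open_diphones` and the phoneme labels are hypothetical. It is not the TISK implementation, which also involves activation dynamics and weighting that are not modeled here.

```python
from itertools import combinations


def open_diphones(phonemes):
    """Return the set of ordered phoneme pairs (adjacent or not) in a sequence.

    Assumption: an "open diphone" is any pair (a, b) where a occurs
    before b in the word. Names and labels here are illustrative only.
    """
    # combinations() keeps the left-to-right order of the input sequence,
    # so each pair is (earlier phoneme, later phoneme).
    return set(combinations(phonemes, 2))


# Hypothetical phoneme labels for the word "cat".
cat = ["k", "ae", "t"]
print(sorted(open_diphones(cat)))
```

Because the result is a set of pairs with no time index, the same units can describe a word wherever it occurs in the input, which is the sense of "time-invariant" assumed in this sketch.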
Full Text Available
Does signal reduction imply predictive coding in models of spoken word recognition?

https://doi.org/10.3758/s13423-021-01924-x

Luthra, Sahil; Li, Monica Y.; You, Heejo; Brodbeck, Christian; Magnuson, James S. (January 2021, Psychonomic Bulletin & Review)
Pervasive behavioral and neural evidence for predictive processing has led to claims that language processing depends upon predictive coding. Formally, predictive coding is a computational mechanism where only deviations from top-down expectations are passed between levels of representation. In many cognitive neuroscience studies, a reduction of signal for expected inputs is taken as being diagnostic of predictive coding. In the present work, we show that despite not explicitly implementing prediction, the TRACE model of speech perception exhibits this putative hallmark of predictive coding, with reductions in total lexical activation, total lexical feedback, and total phoneme activation when the input conforms to expectations. These findings may indicate that interactive activation is functionally equivalent or approximant to predictive coding or that caution is warranted in interpreting neural signal reduction as diagnostic of predictive coding.
Full Text Available
Friends in Low‐Entropy Places: Orthographic Neighbor Effects on Visual Word Identification Differ Across Letter Positions

https://doi.org/10.1111/cogs.12917

Luthra, Sahil; You, Heejo; Rueckl, Jay G.; Magnuson, James S. (December 2020, Cognitive Science)

Visual word recognition is facilitated by the presence of orthographic neighbors that mismatch the target word by a single letter substitution. However, researchers typically do not consider where neighbors mismatch the target. In light of evidence that some letter positions are more informative than others, we investigate whether the influence of orthographic neighbors differs across letter positions. To do so, we quantify the number of enemies at each letter position (how many neighbors mismatch the target word at that position). Analyses of reaction time data from a visual word naming task indicate that the influence of enemies differs across letter positions, with the negative impacts of enemies being most pronounced at letter positions where readers have low prior uncertainty about which letters they will encounter (i.e., positions with low entropy). To understand the computational mechanisms that give rise to such positional entropy effects, we introduce a new computational model, VOISeR (Visual Orthographic Input Serial Reader), which receives orthographic inputs in parallel and produces an over‐time sequence of phonemes as output. VOISeR produces a similar pattern of results as in the human data, suggesting that positional entropy effects may emerge even when letters are not sampled serially. Finally, we demonstrate that these effects also emerge in human subjects' data from a lexical decision task, illustrating the generalizability of positional entropy effects across visual word recognition paradigms. Taken together, such work suggests that research into orthographic neighbor effects in visual word recognition should also consider differences between letter positions.
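The two quantities described in the abstract above, enemies per letter position and positional entropy, can be sketched on a toy lexicon. This is a minimal sketch under stated assumptions: the function names, the toy word list, and the choice of unweighted (type-based) Shannon entropy over same-length words are all hypothetical, and the published analysis may compute entropy differently (for example, weighting by word frequency).

```python
import math
from collections import Counter


def enemies_by_position(target, lexicon):
    """Count single-substitution neighbors that mismatch the target at each position."""
    counts = [0] * len(target)
    for word in lexicon:
        if len(word) != len(target) or word == target:
            continue
        mismatches = [i for i, (a, b) in enumerate(zip(target, word)) if a != b]
        # A neighbor differs from the target at exactly one position;
        # it counts as an "enemy" at that position.
        if len(mismatches) == 1:
            counts[mismatches[0]] += 1
    return counts


def positional_entropy(lexicon, length):
    """Shannon entropy (in bits) of the letter distribution at each position.

    Assumption: every word of the given length counts equally.
    """
    words = [w for w in lexicon if len(w) == length]
    entropies = []
    for i in range(length):
        freq = Counter(w[i] for w in words)
        total = sum(freq.values())
        entropies.append(-sum((n / total) * math.log2(n / total) for n in freq.values()))
    return entropies


# Hypothetical toy lexicon, for illustration only.
toy_lexicon = ["cat", "bat", "hat", "cot", "cab", "dog"]
print(enemies_by_position("cat", toy_lexicon))
print(positional_entropy(toy_lexicon, 3))
```

In this toy example, "bat" and "hat" are enemies of "cat" at the first position, "cot" at the second, and "cab" at the third, while "dog" differs at every position and is not a neighbor at all.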
EARSHOT: A Minimal Neural Network Model of Incremental Human Speech Recognition

https://doi.org/10.1111/cogs.12823

Magnuson, James S.; You, Heejo; Luthra, Sahil; Li, Monica; Nam, Hosung; Escabí, Monty; Brown, Kevin; Allopenna, Paul D.; Theodore, Rachel M.; Monto, Nicholas; et al (April 2020, Cognitive Science)

Despite the lack of invariance problem (the many‐to‐many mapping between acoustics and percepts), human listeners experience phonetic constancy and typically perceive what a speaker intends. Most models of human speech recognition (HSR) have side‐stepped this problem, working with abstract, idealized inputs and deferring the challenge of working with real speech. In contrast, carefully engineered deep learning networks allow robust, real‐world automatic speech recognition (ASR). However, the complexities of deep learning architectures and training regimens make it difficult to use them to provide direct insights into mechanisms that may support HSR. In this brief article, we report preliminary results from a two‐layer network that borrows one element from ASR, long short‐term memory nodes, which provide dynamic memory for a range of temporal spans. This allows the model to learn to map real speech from multiple talkers to semantic targets with high accuracy, with human‐like timecourse of lexical access and phonological competition. Internal representations emerge that resemble phonetically organized responses in human superior temporal gyrus, suggesting that the model develops a distributed phonological code despite no explicit training on phonetic or phonemic targets. The ability to work with real speech is a major advance for cognitive models of HSR.