Title: Lowered inter-stimulus discriminability hurts incremental contributions to learning
Abstract

How does the similarity between stimuli affect our ability to learn appropriate response associations for them? In typical laboratory experiments, learning is investigated under somewhat ideal circumstances, where stimuli are easily discriminable. This is not representative of most real-life learning, where overlapping “stimuli” can result in different “rewards” and may be learned simultaneously (e.g., you may learn over repeated interactions that a specific dog is friendly, but that a very similar-looking one isn’t). With two experiments, we test how humans learn in three stimulus conditions: one “best case” condition in which stimuli have idealized and highly discriminable visual and semantic representations, and two in which stimuli have overlapping representations, making them less discriminable. We find that, unsurprisingly, decreasing stimulus discriminability decreases performance. We develop computational models to test different hypotheses about how reinforcement learning (RL) and working memory (WM) processes are affected by different stimulus conditions. Our results replicate earlier studies demonstrating the importance of both processes to capture behavior. However, our results extend previous studies by demonstrating that RL, and not WM, is affected by stimulus distinctness: people learn more slowly and have higher across-stimulus value confusion at decision when stimuli are more similar to each other. These results illustrate strong effects of stimulus type on learning and demonstrate the importance of considering parallel contributions of different cognitive processes when studying behavior.
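The two RL effects named in the abstract (slower incremental learning and across-stimulus value confusion at decision) can be illustrated with a minimal sketch. This is not the authors' model: the delta-rule update, the softmax choice rule, the blending scheme, and every name and parameter here (`alpha`, `beta`, `confusion`, `q_table`) are assumptions chosen only for illustration.

```python
import math

def delta_rule_update(q, reward, alpha):
    """Move the value estimate q a fraction alpha toward the observed reward.
    A smaller alpha means slower incremental learning (assumed mechanism)."""
    return q + alpha * (reward - q)

def confused_values(q_table, stimulus, confusion):
    """Blend the current stimulus's action values with the mean action values
    of all other stimuli. confusion=0 means no across-stimulus interference.
    This blending rule is a hypothetical stand-in, not the paper's equation."""
    others = [s for s in q_table if s != stimulus]
    own = q_table[stimulus]
    if not others:
        return list(own)
    blended = []
    for a in range(len(own)):
        other_mean = sum(q_table[s][a] for s in others) / len(others)
        blended.append((1 - confusion) * own[a] + confusion * other_mean)
    return blended

def softmax(values, beta):
    """Turn action values into choice probabilities with inverse temperature beta."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Two similar-looking stimuli whose correct actions differ (hypothetical values).
q_table = {"dog_a": [0.9, 0.1], "dog_b": [0.1, 0.9]}
p_clear = softmax(confused_values(q_table, "dog_a", confusion=0.0), beta=5.0)
p_confused = softmax(confused_values(q_table, "dog_a", confusion=0.4), beta=5.0)
```

In this sketch, raising `confusion` pulls the choice probabilities for `dog_a` toward those appropriate for `dog_b`, so the correct action is chosen less reliably even though the stored values themselves are unchanged.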

 
PAR ID:
10452166
Publisher / Repository:
Springer Science + Business Media
Journal Name:
Cognitive, Affective, & Behavioral Neuroscience
ISSN:
1530-7026
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Human learning and decision-making are supported by multiple systems operating in parallel. Recent studies isolating the contributions of reinforcement learning (RL) and working memory (WM) have revealed a trade-off between the two. An interactive WM/RL computational model predicts that although high WM load slows behavioral acquisition, it also induces larger prediction errors in the RL system that enhance robustness and retention of learned behaviors. Here, we tested this account by parametrically manipulating WM load during RL in conjunction with EEG in both male and female participants and administered two surprise memory tests. We further leveraged single-trial decoding of EEG signatures of RL and WM to determine whether their interaction predicted robust retention. Consistent with the model, behavioral learning was slower for associations acquired under higher load but showed parametrically improved future retention. This paradoxical result was mirrored by EEG indices of RL, which were strengthened under higher WM loads and predictive of more robust future behavioral retention of learned stimulus–response contingencies. We further tested whether stress alters the ability to shift between the two systems strategically to maximize immediate learning versus retention of information and found that induced stress had only a limited effect on this trade-off. The present results offer a deeper understanding of the cooperative interaction between WM and RL and show that relying on WM can benefit the rapid acquisition of choice behavior during learning but impairs retention.

    SIGNIFICANCE STATEMENT: Successful learning is achieved by the joint contribution of the dopaminergic RL system and WM. The cooperative WM/RL model was productive in improving our understanding of the interplay between the two systems during learning, demonstrating that reliance on RL computations is modulated by WM load. However, the role of WM/RL systems in the retention of learned stimulus–response associations remained unestablished. Our results show that increased neural signatures of learning, indicative of greater RL computation, under high WM load also predicted better stimulus–response retention. This result supports a trade-off between the two systems, where degraded WM increases RL processing, which improves retention. Notably, we show that this cooperative interplay remains largely unaffected by acute stress.
  2. Abstract

    The Reward‐Positivity (RewP) is a frontocentral event‐related potential elicited following reward and punishment feedback. Reinforcement learning theories propose the RewP reflects a reward prediction error that increases following more favorable (vs. unfavorable) outcomes. An alternative perspective, however, proposes this component indexes a salience‐prediction error that increases following more salient outcomes. Evidence from prior studies that included both reward and punishment conditions is mixed, supporting both accounts. However, these studies often varied how feedback stimuli were repeated across reward and punishment conditions. Differences in the frequency of feedback stimuli may drive inconsistencies by introducing salience effects for infrequent stimuli regardless of whether they are associated with rewards or punishments. To test this hypothesis, the current study examined the effect of outcome valence and stimulus frequency on the RewP and neighboring P2 and P3 components in reward, punishment, and neutral contexts across two separate experiments that varied how often feedback stimuli were repeated between conditions. Experiment 1 revealed infrequent feedback stimuli generated overlapping positivity across all three components. However, controlling for stimulus frequency, experiment 2 revealed favorable outcomes that increased RewP and P3 positivity. Together, these results suggest the RewP reflects some combination of reward‐ and salience‐prediction error encoding. Results also indicate infrequent feedback stimuli elicited strong salience effects across all three components that may inflate, eliminate, or reverse outcome valence effects for the RewP and P3. These results resolve several inconsistencies in the literature and have important implications for electrocortical investigations of reward and punishment feedback processing.

     
  3. Abstract In reinforcement learning (RL) experiments, participants learn to make rewarding choices in response to different stimuli; RL models use outcomes to estimate stimulus–response values that change incrementally. RL models consider any response type indiscriminately, ranging from more concretely defined motor choices (pressing a key with the index finger), to more general choices that can be executed in a number of ways (selecting dinner at the restaurant). However, does the learning process vary as a function of the choice type? In Experiment 1, we show that it does: Participants were slower and less accurate in learning correct choices of a general format compared with learning more concrete motor actions. Using computational modeling, we show that two mechanisms contribute to this. First, there was evidence of irrelevant credit assignment: The values of motor actions interfered with the values of other choice dimensions, resulting in more incorrect choices when the correct response was not defined by a single motor action; second, information integration for relevant general choices was slower. In Experiment 2, we replicated and further extended the findings from Experiment 1 by showing that slowed learning was attributable to weaker working memory use, rather than slowed RL. In both experiments, we ruled out the explanation that the difference in performance between two condition types was driven by difficulty/different levels of complexity. We conclude that defining a more abstract choice space used by multiple learning systems for credit assignment recruits executive resources, limiting how much such processes then contribute to fast learning. 
  4. Abstract

    The current study aimed to elucidate the contributions of the subcortical basal ganglia to human language by adopting the view that these structures engage in a basic neurocomputation that may account for its involvement across a wide range of linguistic phenomena. Specifically, we tested the hypothesis that basal ganglia reinforcement learning (RL) mechanisms may account for variability in semantic selection processes necessary for ambiguity resolution. To test this, we used a biased homograph lexical ambiguity priming task that allowed us to measure automatic processes for resolving ambiguity toward high‐frequency word meanings. Individual differences in task performance were then related to indices of basal ganglia RL, which were used to group subjects into three learning styles: (a) Choosers who learn by seeking high reward probability stimuli; (b) Avoiders, who learn by avoiding low reward probability stimuli; and (c) Balanced participants, whose learning reflects equal contributions of choose and avoid processes. The results suggest that balanced individuals had significantly lower access to subordinate, or low‐frequency, homograph word meanings. Choosers and Avoiders, on the other hand, had higher access to the subordinate word meaning even after a long delay between prime and target. Experimental findings were then tested using an ACT‐R computational model of RL that learns from both positive and negative feedback. Results from the computational model simulations confirm and extend the pattern of behavioral findings, providing an RL account of individual differences in lexical ambiguity resolution.

     
  5. Abstract

    The brain is organized such that it encodes and maintains category information about thousands of objects. However, how learning shapes these neural representations of object categories is unknown. The present study focuses on faces, examining whether: (1) Enhanced categorical discrimination or (2) Feature analysis enhances face/non‐face categorization in the brain. Stimuli ranged from non‐faces to faces with two‐toned Mooney images used for testing and gray‐scale images used for training. The stimulus set was specifically chosen because it has a true categorical boundary between faces and non‐faces but the stimuli surrounding that boundary have very similar features, making the boundary harder to learn. Brain responses were measured using functional magnetic resonance imaging while participants categorized the stimuli before and after training. Participants were either trained with a categorization task, or with non‐categorical semblance analyzation. Interestingly, when participants were categorically trained, the neural activity pattern in the left fusiform gyrus shifted from a graded representation of the stimuli to a categorical representation. This corresponded with categorical face/non‐face discrimination, critically including both an increase in selectivity to faces and a decrease in false alarm response to non‐faces. By contrast, while activity pattern in the right fusiform cortex correlated with face/non‐face categorization prior to training, it was not affected by learning. Our results reveal the key role of the left fusiform cortex in learning face categorization. Given the known right hemisphere dominance for face‐selective responses, our results suggest a rethink of the relationship between the two hemispheres in face/non‐face categorization. Hum Brain Mapp 38:3648–3658, 2017. ©2017 Wiley Periodicals, Inc.

     