Title: A Computational Model for Latent Learning based on Hippocampal Replay
We show how hippocampal replay could explain latent learning, a phenomenon observed in animals in which unrewarded pre-exposure to an environment, i.e., habituation, improves task learning rates once rewarded trials begin. We first describe a computational model for spatial navigation inspired by rat studies. The model exploits offline replay of trajectories previously learned by applying reinforcement learning. Then, to assess our hypothesis, the model is evaluated in a “multiple T-maze” environment where rats need to learn a path from the start of the maze to the goal. Simulation results support our hypothesis that pre-exposed or habituated rats learn the task significantly faster than non-pre-exposed rats. Results also show that this effect increases with the number of pre-exposure trials.
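As a rough illustration of the mechanism the abstract describes, and not the authors' model, the sketch below stores transitions in memory and re-applies a reinforcement learning update to them offline. Everything in it is an assumption made for the example: the six-state maze (a corridor 0 to 4 with a side arm 5 off junction 2), tabular one-step Q-learning, newest-first replay order, the parameter values, and all function names.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # illustrative learning rate and discount factor
GOAL = 4                 # hypothetical goal state at the end of the corridor

RUN = [0, 1, 2, 3, 4]            # rewarded trial: start to goal
EXPLORE = [0, 1, 2, 5, 2, 3, 4]  # unrewarded pre-exposure, includes side arm 5


def td_update(q, s, a, r, s_next):
    """One-step Q-learning update for a single transition."""
    best_next = max(q[s_next].values(), default=0.0)
    q[s][a] += ALPHA * (r + GAMMA * best_next - q[s][a])


def walk(q, memory, known_reward, path, rewarded):
    """Follow a fixed path, learning online and storing every transition.

    Actions are named after the state they lead to.
    """
    for s, s_next in zip(path, path[1:]):
        r = 1.0 if (rewarded and s_next == GOAL) else 0.0
        if r:
            known_reward[s_next] = r  # remember where reward was found
        td_update(q, s, s_next, r, s_next)
        memory.append((s, s_next))


def replay(q, memory, known_reward, sweeps=1):
    """Offline replay: re-apply the update to stored transitions, newest first,
    using the reward currently associated with each destination state."""
    for _ in range(sweeps):
        for s, s_next in reversed(memory):
            td_update(q, s, s_next, known_reward.get(s_next, 0.0), s_next)


def simulate(pre_exposure_paths, rewarded_path, sweeps=1):
    """Run optional pre-exposure, one rewarded trial, then offline replay.

    Returns the Q-table and the start state's value before replay.
    """
    q = defaultdict(lambda: defaultdict(float))
    memory, known_reward = [], {}
    for path in pre_exposure_paths:
        walk(q, memory, known_reward, path, rewarded=False)
    walk(q, memory, known_reward, rewarded_path, rewarded=True)
    online_start_value = max(q[rewarded_path[0]].values(), default=0.0)
    replay(q, memory, known_reward, sweeps)
    return q, online_start_value


q_naive, _ = simulate([], RUN)
q_habituated, _ = simulate([EXPLORE], RUN)
print("naive, value at side arm 5:", max(q_naive[5].values(), default=0.0))
print("habituated, value at side arm 5:", max(q_habituated[5].values(), default=0.0))
```

In this toy setting, the online pass of the rewarded trial only updates the transition into the goal, and replay is what carries value back to the start. The pre-exposed agent also assigns value to the side arm it visited only without reward, because those stored transitions take part in replay. The sketch does not attempt to reproduce the learning-speed results reported in the abstract.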
Award ID(s):
1703225
PAR ID:
10287562
Journal Name:
2020 International Joint Conference on Neural Networks (IJCNN)
Page Range / eLocation ID:
1 to 8
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The value of the environment determines animals’ motivational states and sets expectations for error-based learning [1–3]. How are values computed? Reinforcement learning systems can store or cache values of states or actions that are learned from experience, or they can compute values using a model of the environment to simulate possible futures [3]. These value computations have distinct trade-offs, and a central question is how neural systems decide which computations to use or whether/how to combine them [4–8]. Here we show that rats use distinct value computations for sequential decisions within single trials. We used high-throughput training to collect statistically powerful datasets from 291 rats performing a temporal wagering task with hidden reward states. Rats adjusted how quickly they initiated trials and how long they waited for rewards across states, balancing effort and time costs against expected rewards. Statistical modeling revealed that animals computed the value of the environment differently when initiating trials versus when deciding how long to wait for rewards, even though these decisions were only seconds apart. Moreover, value estimates interacted via a dynamic learning rate. Our results reveal how distinct value computations interact on rapid timescales, and demonstrate the power of using high-throughput training to understand rich, cognitive behaviors.
  2. Weitzenfeld, A (Ed.)
    In the last decade, studies have demonstrated that hippocampal place cells influence rats’ navigational learning ability. Moreover, researchers have observed that place cell sequences associated with routes leading to a reward are reactivated during rest periods. This phenomenon is known as Hippocampal Replay, which is thought to aid navigational learning and memory consolidation. These findings in neuroscience have inspired new robot navigation models that emulate the learning process of mammals. This study presents a novel model that encodes path information using place cell connections formed during online navigation. Our model employs these connections to generate sequences of state-action pairs to train our actor-critic reinforcement learning model offline. Our results indicate that our method can accelerate the learning process of solving an open-world navigational task. Specifically, we demonstrate that our approach can learn optimal paths through open-field mazes with obstacles. 
  3. Vivid episodic memories in humans have been described as the replay of the flow of past events in sequential order. Recently, Panoz-Brown et al. (2018) developed an olfactory memory task in which rats were presented with a list of trial-unique odors in an encoding context; next, in a distinctive memory assessment context, the rats were rewarded for choosing the second to last item from the list while avoiding other items from the list. In a different memory assessment context, the fourth to last item was rewarded. According to the episodic memory replay hypothesis, the rat remembers the list items and searches these items to find the item at the targeted locations in the list. However, events presented sequentially differ in memory trace strength, allowing a rat to use the relative familiarity of the memory traces, instead of episodic memory replay, to solve the task. Here, we directly manipulated memory trace strength by manipulating the odor intensity of target odors in both the list presentation and memory assessment. The rats relied on episodic memory replay to solve the memory assessment in conditions in which reliance on memory trace strength is ruled out. We conclude that rats are able to replay episodic memories. 
  4. Abstract Although events are not always known to be important when they occur, people can remember details about such incidentally encoded information using episodic memory. Sheridan et al. (2024) argued that rats replayed episodic memories of incidentally encoded information in an unexpected assessment of memory. In one task, rats reported the third-last item in an explicitly encoded list of trial-unique odors. In a second task, rats foraged in a radial maze in the absence of odors. On a critical test, rats foraged in the maze, but scented lids covered the food. Next, memory of the third-last odor was assessed. The rats correctly answered the unexpected question. Because the odors used in the critical test were the same as those used during training, automatically encoding odors for the purpose of taking an upcoming test of memory (stimulus generalization) may have been encouraged. Here, we provided an opportunity for incidental encoding of novel odors. Previously trained rats foraged in the radial maze with entirely novel odors covering the food. Next, memory of the third-last odor was assessed. The rats correctly answered the unexpected question. High accuracy when confronted with novel odors provides evidence that the rats did not automatically encode odors for the purpose of taking an upcoming test, ruling out stimulus generalization. We conclude that rats encode multiple pieces of putatively unimportant information, and later replay a stream of novel episodic memories when that information is needed to solve an unexpected problem.
  5. Introduction: Impulsivity is a symptom of Attention-Deficit/Hyperactivity Disorder (ADHD) and variants in the Lphn3 (Adgrl3) gene [OMIM 616417] have been linked to ADHD. This project utilized a delay-discounting (DD) task to examine the impact of Lphn3 deletion in rats on impulsive choice. “Positive control” measures were also collected in Spontaneously Hypertensive Rats (SHRs), another animal model of ADHD. Methods: For Experiment I, rats were given the option to press one lever for a delayed reward of 3 food pellets or the other lever for an immediate reward of 1 pellet. Impulsive choice was measured as the tendency to discount the larger, delayed reward. We hypothesized that impulsive choice would be greater in the SHR and Lphn3 knockout (KO) rats relative to their control strains - Wistar-Kyoto (WKY) and Lphn3 wildtype (WT) rats, respectively. Results: The results did not completely support the hypothesis, as only the SHRs (but not the Lphn3 KO rats) demonstrated a decrease in the percent choice for the larger reward. Because subsequent trials did not begin until the end of the delay period regardless of which lever was selected, rats were required to wait for the next trial to start even if they picked the immediate lever. Experiment II examined whether the rate of reinforcement influenced impulsive choice by using a DD task that incorporated a 1 sec inter-trial interval (ITI) immediately after delivery of either the immediate (1 pellet) or delayed (3 pellet) reinforcer. The results of Experiment II found no difference in the percent choice for the larger reward between Lphn3 KO and WT rats, demonstrating reinforcement rate did not influence impulsive choice in Lphn3 KO rats. Discussion: Overall, there were impulsivity differences among the ADHD models, as SHRs exhibited deficits in impulsive choice, while the Lphn3 KO rats did not. 