Title: Reward-based training of recurrent neural networks for cognitive and value-based tasks
A major goal in neuroscience is to understand the relationship between an animal’s behavior and how this is encoded in the brain. Therefore, a typical experiment involves training an animal to perform a task and recording the activity of its neurons – brain cells – while the animal carries out the task. To complement these experimental results, researchers “train” artificial neural networks – simplified mathematical models of the brain that consist of simple neuron-like units – to simulate the same tasks on a computer. Unlike real brains, artificial neural networks provide complete access to the “neural circuits” responsible for a behavior, offering a way to study and manipulate the behavior in the circuit. One open issue about this approach has been the way in which the artificial networks are trained. In a process known as reinforcement learning, animals learn from rewards (such as juice) that they receive when they choose actions that lead to the successful completion of a task. By contrast, the artificial networks are explicitly told the correct action. In addition to differing from how animals learn, this limits the types of behavior that can be studied using artificial neural networks. Recent advances in the field of machine learning that combine reinforcement learning with artificial neural networks have now allowed Song et al. to train artificial networks to perform tasks in a way that mimics the way that animals learn. The networks consisted of two parts: a “decision network” that uses sensory information to select actions that lead to the greatest reward, and a “value network” that predicts how rewarding an action will be. Song et al. found that the resulting artificial “brain activity” closely resembled the activity found in the brains of animals, confirming that this method of training artificial neural networks may be a useful tool for neuroscientists who study the relationship between brains and behavior.
The training method explored by Song et al. represents only one step forward in developing artificial neural networks that resemble the real brain. In particular, neural networks modify connections between units in a vastly different way to the methods used by biological brains to alter the connections between neurons. Future work will be needed to bridge this gap.
Award ID(s):
1631586
NSF-PAR ID:
10039996
Journal Name:
eLife
Volume:
6
ISSN:
2050-084X
Sponsoring Org:
National Science Foundation
More Like this
  1. Understanding the intricacies of the brain often requires spotting and tracking specific neurons over time and across different individuals. For instance, scientists may need to precisely monitor the activity of one neuron even as the brain moves and deforms; or they may want to find universal patterns by comparing signals from the same neuron across different individuals. Both tasks require matching which neuron is which in different images and amongst a constellation of cells. This is theoretically possible in certain ‘model’ animals where every single neuron is known and carefully mapped out. Still, it remains challenging: neurons move relative to one another as the animal changes posture, and the position of a cell is also slightly different between individuals. Sophisticated computer algorithms are increasingly used to tackle this problem, but they are far too slow to track neural signals as real-time experiments unfold. To address this issue, Yu et al. designed a new algorithm based on the Transformer, an artificial neural network originally used to spot relationships between words in sentences. To learn relationships between neurons, the algorithm was fed hundreds of thousands of ‘semi-synthetic’ examples of constellations of neurons. Instead of painfully collated actual experimental data, these datasets were created by a simulator based on a few simple measurements. Testing the new algorithm on the tiny worm Caenorhabditis elegans revealed that it was faster and more accurate, finding corresponding neurons in about 10 ms. The work by Yu et al. demonstrates the power of using simulations rather than experimental data to train artificial networks. The resulting algorithm can be used immediately to help study how the brain of C. elegans makes decisions or controls movements. Ultimately, this research could allow brain-machine interfaces to be developed.
  2. Humans and most animals can learn new tasks without forgetting old ones. However, training artificial neural networks (ANNs) on new tasks typically causes them to forget previously learned tasks. This phenomenon is the result of “catastrophic forgetting,” in which training an ANN disrupts connection weights that were important for solving previous tasks, degrading task performance. Several recent studies have proposed methods to stabilize connection weights of ANNs that are deemed most important for solving a task, which helps alleviate catastrophic forgetting. Here, drawing inspiration from algorithms that are believed to be implemented in vivo, we propose a complementary method: adding a context-dependent gating signal, such that only sparse, mostly nonoverlapping patterns of units are active for any one task. This method is easy to implement, requires little computational overhead, and allows ANNs to maintain high performance across large numbers of sequentially presented tasks, particularly when combined with weight stabilization. We show that this method works for both feedforward and recurrent network architectures, trained using either supervised or reinforcement-based learning. This suggests that using multiple, complementary methods, akin to what is believed to occur in the brain, can be a highly effective strategy to support continual learning.

  3. A new housing development in a familiar neighborhood, a wrong turn that ends up lengthening a Sunday stroll: our internal representation of the world requires constant updating, and we need to be able to associate events separated by long intervals of time to fine-tune future outcomes. This often requires neural connections to be altered. A brain region known as the hippocampus is involved in building and maintaining a map of our environment. However, signals from other brain areas can activate silent neurons in the hippocampus when the body is in a specific location by triggering cellular events called dendritic calcium spikes. Milstein et al. explored whether dendritic calcium spikes in the hippocampus could also help the brain to update its map of the world by enabling neurons to stop being active at one location and to start responding at a new position. Experiments in mice showed that calcium spikes could change which features of the environment individual neurons respond to by strengthening or weakening connections between specific cells. Crucially, this mechanism allowed neurons to associate event sequences that unfold over a longer timescale, one more relevant to those encountered in day-to-day life. A computational model was then put together, and it demonstrated that dendritic calcium spikes in the hippocampus could enable the brain to make better spatial decisions in future. Indeed, these spikes are driven by inputs from brain regions involved in complex cognitive processes, potentially enabling the delayed outcomes of navigational choices to guide changes in the activity and wiring of neurons. Overall, the work by Milstein et al. advances the understanding of learning and memory in the brain and may inform the design of better systems for artificial learning.
  4. Understanding the neural basis of behavior is a challenging task for technical reasons. Most methods of recording neural activity require animals to be immobilized, but neural activity associated with most behavior cannot be recorded from an anesthetized, immobilized animal. Using amphibians, however, there has been some success in developing in vitro brain preparations that can be used for electrophysiological and anatomical studies. Here, we describe an ex vivo frog brain preparation from which fictive vocalizations (the neural activity that would have produced vocalizations had the brain been attached to the muscle) can be elicited repeatedly. When serotonin is applied to the isolated brains of male and female African clawed frogs, Xenopus laevis, laryngeal nerve activity that is a facsimile of the activity underlying sex-specific vocalizations in vivo can be readily recorded. Recently, this preparation was successfully used in other species within the genus including Xenopus tropicalis and Xenopus victorianus. This preparation allows a variety of techniques to be applied including extracellular and intracellular electrophysiological recordings and calcium imaging during vocal production, surgical and pharmacological manipulation of neurons to evaluate their impact on motor output, and tract tracing of the neural circuitry. Thus, the preparation is a powerful tool with which to understand the basic principles that govern the production of coherent and robust motor programs in vertebrates.
  5. Most animals develop from juveniles, which cannot reproduce, to sexually mature adults. The most obvious signs of this transition are changes in body shape and size. However, changes also take place in the brain that enable the animals to adapt their behavior to the demands of adulthood. For example, fully fed adult male roundworms will leave a food source to search for mates, whereas juvenile males will continue feeding. The transition to sexual maturity needs to be carefully timed. Too early, and the animal risks compromising key stages of development. Too late, and the animal may be less competitive in the quest for reproductive success. Cues in the environment, such as the presence of food and mates, interact with timing mechanisms in the brain to trigger sexual maturity. But how these mechanisms work – in particular where and how an animal keeps track of its developmental stage – is not well understood. In the roundworm species Caenorhabditis elegans, waves of gene activity, known collectively as the heterochronic pathway, determine patterns of cell growth as animals mature. Through further studies of these worms, Lawson et al. now show that these waves also control the time at which neural circuits mature. In addition, the waves of activity occur inside the nervous system itself, rather than in a tissue that sends signals to the nervous system. Moreover, they occur independently inside many different neurons. Each neuron thus has its own molecular clock for keeping track of development. Several of the genes critical for developmental timekeeping in worms are also found in mammals, including two genes that help to control when puberty starts in humans. If one of these genes – called MKRN3 – does not work correctly, it can lead to a condition that causes individuals to go through puberty several years earlier than normal. Studying the mechanisms identified in roundworms may help us to better understand this disorder.
More generally, future work that builds on the results presented by Lawson et al. will help to reveal how environmental cues and gene activity interact to control when we become adults.