

Search for: All records

Award ID contains: 1845322

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. One major way that people engage in adaptive problem solving is by imitating others’ solutions. Prominent simulation models have found imperfect imitation advantageous, but the interactions between copying amount and other prevalent aspects of social learning strategies have been underexplored. Here, we explore the consequences for a group when its members engage in strategies with different degrees of copying, solving search problems of varying complexity, in different network topologies that affect the solutions visible to each member. Using a computational model of collective problem solving, we demonstrate that the advantage of partial copying is robust across these conditions, arising from its ability to maintain diversity. Partial copying delays convergence generally but especially in globally connected networks, which are typically associated with diversity loss, allowing more exploration of a problem space. We show that a moderate amount of diversity maintenance is optimal and strategies can be adjusted to find that sweet spot. 
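For a sense of what such a model involves, here is a minimal sketch of partial copying during collective search, in the spirit of the abstract above; the random score table, the fully connected network, the mutation rate, and the copy fraction are all illustrative assumptions, not the authors' implementation.

```python
import random

N_BITS, N_AGENTS, STEPS = 15, 20, 200
random.seed(0)

# A rugged score table standing in for the paper's search problem (assumed).
score = {i: random.random() for i in range(2 ** N_BITS)}

def fitness(bits):
    return score[int("".join(map(str, bits)), 2)]

agents = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(N_AGENTS)]

def step(copy_fraction):
    """Each agent copies part of its best visible neighbor's solution.

    copy_fraction = 1.0 is full imitation; smaller values are the
    'partial copying' the abstract argues preserves diversity.
    """
    best = max(agents, key=fitness)             # globally connected network (assumed)
    for a in agents:
        for i in range(N_BITS):
            if random.random() < copy_fraction:
                a[i] = best[i]                  # imitate this bit
            elif random.random() < 0.05:
                a[i] ^= 1                       # otherwise explore by mutation

for t in range(STEPS):
    step(copy_fraction=0.3)                     # partial, not full, copying

print("best fitness:", max(fitness(a) for a in agents))
print("unique solutions left:", len({tuple(a) for a in agents}))
```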
  2. Lifetime learning, the change (or acquisition) of behaviors during a lifetime based on experience, is a hallmark of living organisms. Multiple mechanisms may be involved, but biological neural circuits have repeatedly demonstrated a vital role in the learning process. These neural circuits are recurrent, dynamic, and non-linear and models of neural circuits employed in neuroscience and neuroethology tend to involve, accordingly, continuous-time, non-linear, and recurrently interconnected components. Currently, the main approach for finding configurations of dynamical recurrent neural networks that demonstrate behaviors of interest is to use stochastic search techniques, such as evolutionary algorithms. In an evolutionary algorithm, these dynamic recurrent neural networks are evolved to perform the behavior over multiple generations, through selection, inheritance, and mutation, across a population of solutions. Although these systems can be evolved to exhibit lifetime learning behavior, there are no explicit rules built into these dynamic recurrent neural networks that facilitate learning during their lifetime (e.g., reward signals). In this work, we examine a biologically plausible lifetime learning mechanism for dynamical recurrent neural networks. We focus on a recently proposed reinforcement learning mechanism inspired by neuromodulatory reward signals and ongoing fluctuations in synaptic strengths. Specifically, we extend one of the best-studied and most commonly used dynamic recurrent neural networks to incorporate the reinforcement learning mechanism. First, we demonstrate that this extended dynamical system (model and learning mechanism) can autonomously learn to perform a central pattern generation task. Second, we compare the robustness and efficiency of the reinforcement learning rules in relation to two baseline models, a random walk and a hill-climbing walk through parameter space. Third, we systematically study the effect of the different meta-parameters of the learning mechanism on the behavioral learning performance. Finally, we report on preliminary results exploring the generality and scalability of this learning mechanism for dynamical neural networks as well as directions for future work.
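As a concrete illustration, the sketch below pairs a standard Euler-integrated CTRNN (the kind of well-studied dynamic recurrent neural network the abstract refers to) with a reward-modulated weight-fluctuation rule of the general kind described; the class layout, learning rate, and fluctuation size are assumptions, not the paper's equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class CTRNN:
    """Standard continuous-time recurrent neural network (Euler-integrated)."""
    def __init__(self, n, dt=0.01, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.n, self.dt = n, dt
        self.tau = np.ones(n)                        # neuron time constants
        self.bias = np.zeros(n)
        self.W_center = self.rng.normal(0, 1, (n, n))  # slowly adapting weight centers
        self.W = self.W_center.copy()
        self.y = np.zeros(n)                         # neuron states

    def step(self, external=0.0):
        out = sigmoid(self.y + self.bias)
        dy = (-self.y + self.W @ out + external) / self.tau
        self.y += self.dt * dy
        return out

    def learn(self, reward, baseline, flux_size=0.1, lr=0.05):
        """Reward-modulated fluctuation rule (a sketch of the mechanism the
        abstract points to, not the paper's exact equations): weights jitter
        around slowly adapting centers, and above-baseline reward pulls each
        center toward the jittered values active when the reward arrived."""
        deviation = self.W - self.W_center
        self.W_center += lr * (reward - baseline) * deviation
        self.W = self.W_center + self.rng.normal(0, flux_size, self.W.shape)

# Toy usage: drive neuron 0's output toward 0.8 (illustrative objective).
net, baseline = CTRNN(3), 0.0
for t in range(2000):
    reward = -abs(net.step()[0] - 0.8)
    net.learn(reward, baseline)
    baseline += 0.05 * (reward - baseline)           # running-average reward baseline
```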
  3. Cejkova, Jitka; Holler, Silvia; Soros, Lisa; Witkowski, Olaf (Eds.)
    In order to make lifelike, versatile learning adaptive in the artificial domain, one needs a very diverse set of behaviors to learn. We propose a parameterized distribution of classic control-style tasks with minimal information shared between tasks. We discuss what makes a task trivial and offer a basic metric, time in convergence, that measures triviality. We then investigate analytic and empirical approaches to generating reward structures for tasks based on their dynamics in order to minimize triviality. Contrary to our expectations, populations evolved on reward structures that incentivized the most stable locations in state space spend the least time in convergence as we have defined it, because of the outsized importance our metric assigns to behavior fine-tuning in these contexts. This work paves the way towards an understanding of which task distributions enable the development of learning. 
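One plausible (assumed) reading of the abstract's "time in convergence" metric is the fraction of an episode a trajectory spends effectively settled at its final state, as in this sketch; the threshold and averaging window are illustrative.

```python
import numpy as np

def time_in_convergence(states, eps=1e-2, window=10):
    """Fraction of timesteps after which the trajectory has effectively
    settled: the state stays within eps of its final value. One possible
    reading of the 'time in convergence' triviality metric (assumed)."""
    states = np.asarray(states, dtype=float)
    final = states[-window:].mean(axis=0)
    settled = np.linalg.norm(states - final, axis=-1) < eps
    for t in range(len(settled)):        # first index from which all later steps settle
        if settled[t:].all():
            return (len(settled) - t) / len(settled)
    return 0.0

# Example: a damped, pendulum-like trajectory converges quickly -> high score,
# i.e. a more 'trivial' task dynamics under this reading.
traj = [[np.exp(-0.1 * t) * np.cos(t)] for t in range(200)]
print(round(time_in_convergence(traj), 3))
```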
  4.
    Multiple mechanisms contribute to the generation, propagation, and coordination of the rhythmic patterns necessary for locomotion in Caenorhabditis elegans. Current experiments have focused on two possibilities: pacemaker neurons and stretch-receptor feedback. Here, we focus on whether it is possible that a chain of multiple network rhythmic pattern generators in the ventral nerve cord also contributes to locomotion. We use a simulation model to search for parameters of the anatomically constrained ventral nerve cord circuit that, when embodied and situated, can drive forward locomotion on agar, in the absence of pacemaker neurons or stretch-receptor feedback. Systematic exploration of the space of possible solutions reveals that there are multiple configurations that result in locomotion that is consistent with certain aspects of the kinematics of worm locomotion on agar. Analysis of the best solutions reveals that gap junctions between different classes of motor neurons in the ventral nerve cord can play key roles in coordinating the multiple rhythmic pattern generators.
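The parameter search the abstract describes can be pictured as a simple (mu + lambda)-style evolutionary loop like the sketch below; the toy fitness function merely stands in for the embodied agar simulation, and the population size and mutation scale are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def fitness(params):
    """Stand-in for the paper's embodied simulation: the actual study would
    run the ventral nerve cord circuit in a body on simulated agar and score
    forward velocity. This is a toy surrogate for illustration only."""
    return -np.sum((params - 0.5) ** 2)

# Evolutionary search over circuit parameters (weights, biases, and
# gap-junction strengths in the real model; 20 generic values here).
pop = rng.uniform(-1, 1, size=(50, 20))
for gen in range(100):
    scores = np.array([fitness(p) for p in pop])
    parents = pop[np.argsort(scores)[-10:]]           # keep the 10 best
    children = np.repeat(parents, 5, axis=0)
    children += rng.normal(0, 0.05, children.shape)   # mutate
    pop = children

print("best:", max(fitness(p) for p in pop))
```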
  5. Ahamed, Tosif (Ed.)
    Motile organisms actively detect environmental signals and migrate to a preferable environment. In particular, small animals convert subtle spatial differences in sensory input into orientation behavioral output for steering directly toward a destination, but the neural mechanisms underlying steering behavior remain elusive. Here, we analyze a C. elegans thermotactic behavior in which a small number of neurons are shown to mediate steering toward a destination temperature. We construct a neuroanatomical model and use an evolutionary algorithm to find configurations of the model that reproduce empirical thermotactic behavior. We find that, in all the evolved models, steering curvature is modulated by temporally persistent thermal signals sensed beyond the time scale of the sinusoidal locomotion of C. elegans. A persistent rise in temperature decreases steering curvature, resulting in straight movement of model worms, whereas a fall in temperature increases curvature, resulting in crooked movement. This relation between temperature change and steering curvature reproduces the empirical thermotactic migration up thermal gradients and the steering bias toward higher temperature. Further, spectrum decomposition of neural activities in model worms shows that thermal signals are transmitted from a sensory neuron to motor neurons on a longer time scale than the sinusoidal locomotion of C. elegans. Our results suggest that employing temporally persistent sensory signals enables small animals to steer toward a destination in natural environments with variable, noisy, and subtle cues.
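The reported relation between persistent temperature change and steering curvature can be sketched as a leaky integrator of the thermal derivative modulating curvature, as below; all constants (time constant, gain, baseline curvature) are illustrative assumptions rather than fitted model values.

```python
import numpy as np

dt, tau_slow = 0.1, 10.0             # tau_slow >> one undulation period (assumption)
base_curvature, gain = 1.0, 5.0      # illustrative constants, not fitted values

slow_dT, curvatures = 0.0, []
for t in np.arange(0, 30, dt):
    dT_dt = 0.01 if t < 15 else -0.01                 # warming phase, then cooling
    slow_dT += (dt / tau_slow) * (dT_dt - slow_dT)    # leaky integrator (persistence)
    # Persistent warming -> lower curvature (straight runs up the gradient);
    # persistent cooling -> higher curvature (crooked, reorienting movement).
    curvatures.append(max(0.0, base_curvature - gain * slow_dT))

print("curvature while warming:", round(curvatures[149], 3),
      "while cooling:", round(curvatures[-1], 3))
```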
  6.
    Behavior involves the ongoing interaction between an organism and its environment. One of the prevailing theories of adaptive behavior is that organisms are constantly making predictions about their future environmental stimuli. However, how they acquire that predictive information is still poorly understood. Two complementary mechanisms have been proposed: predictions are generated from an agent’s internal model of the world or predictions are extracted directly from the environmental stimulus. In this work, we demonstrate that predictive information, measured using bivariate mutual information, cannot distinguish between these two kinds of systems. Furthermore, we show that predictive information cannot distinguish between organisms that are adapted to their environments and random dynamical systems exposed to the same environment. To understand the role of predictive information in adaptive behavior, we need to be able to identify where it is generated. To do this, we decompose information transfer across the different components of the organism-environment system and track the flow of information in the system over time. To validate the proposed framework, we examine it on a set of computational models of idealized agent-environment systems. Analysis of the systems revealed three key insights. First, predictive information, when sourced from the environment, can be reflected in any agent irrespective of its ability to perform a task. Second, predictive information, when sourced from the nervous system, requires special dynamics acquired during the process of adapting to the environment. Third, the magnitude of predictive information in a system can be different for the same task if the environmental structure changes.
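The "predictive information, measured using bivariate mutual information" of the abstract is just I(x_t; x_{t+lag}); a minimal sketch follows, with a periodic stimulus illustrating how a fully predictable input carries high predictive information regardless of the system that happens to relay it.

```python
import numpy as np
from collections import Counter

def predictive_info(x, lag=1):
    """Bivariate mutual information I(x_t ; x_{t+lag}) on a discrete series,
    the 'predictive information' measure the abstract refers to."""
    pairs = list(zip(x[:-lag], x[lag:]))
    n = len(pairs)
    p_xy = Counter(pairs)
    p_x, p_y = Counter(x[:-lag]), Counter(x[lag:])
    return sum((c / n) * np.log2((c / n) / ((p_x[a] / n) * (p_y[b] / n)))
               for (a, b), c in p_xy.items())

# A perfectly predictable stimulus carries high predictive information even
# for a 'random' system that merely passes it through -- the abstract's point
# that the measure alone cannot locate where prediction is generated.
stim = [0, 1, 2, 3] * 250
print(round(predictive_info(stim), 3))   # ~2 bits: the next symbol is fully determined
```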
  7. Living organisms learn on multiple time scales: evolutionary as well as individual-lifetime learning. These two learning modes are complementary: the innate phenotypes developed through evolution significantly influence lifetime learning. However, it is still unclear how these two learning modes interact and whether there is a benefit to optimizing part of the system on one time scale using a population-based approach while the rest is trained on another time scale using an individual learning algorithm. In this work, we study the benefits of such a hybrid approach using an actor-critic framework where the critic part of an agent is optimized over evolutionary time based on its ability to train the actor part of an agent during its lifetime. Typically, critics are optimized on the same time scale as the actor using the Bellman equation to represent long-term expected reward. We show that evolution can find a variety of different solutions that can still enable an actor to learn to perform a behavior during its lifetime. We also show that although the solutions found by evolution represent different functions, they all provide similar training signals during the lifetime. This suggests that learning on multiple time scales can effectively simplify the overall optimization process in the actor-critic framework by finding one of many solutions that can still train an actor just as well. Furthermore, analysis of the evolved critics can yield additional possibilities for reinforcement learning beyond the Bellman equation.
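A toy version of the arrangement the abstract describes: a critic (here just two evolved numbers, a hypothetical stand-in for the real parameterization) is selected on how well the actor it trains ends up performing, with no Bellman targets anywhere; the task, learning rates, and population settings are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = 0.7                                   # hidden toy task: actor output should hit 0.7

def lifetime(critic, steps=60, lr=0.1):
    """Train a one-parameter actor using only the evolved critic's signal."""
    a = 0.0
    c0, c1 = critic
    for _ in range(steps):
        a += lr * (c0 * a + c1)                # critic's advice, not a Bellman target
    return -(a - TARGET) ** 2                  # true performance, seen only by evolution

# Evolve critics on how well the actors they train end up performing.
critics = rng.normal(0, 1, size=(30, 2))
for gen in range(60):
    scores = np.array([lifetime(c) for c in critics])
    parents = critics[np.argsort(scores)[-6:]]
    critics = np.repeat(parents, 5, axis=0) + rng.normal(0, 0.05, (30, 2))

# Many distinct (c0, c1) critics train the actor equally well: any stable pair
# whose fixed point -c1/c0 lands near TARGET works, echoing the abstract's
# finding that different evolved functions provide similar training signals.
best = critics[np.argmax([lifetime(c) for c in critics])]
print("an evolved critic:", best.round(2), "fixed point:", round(-best[1] / best[0], 3))
```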
  8. Artificial Life has a long tradition of studying the interaction between learning and evolution. Thanks to the increased use of individual learning techniques in Artificial Intelligence, there has been a recent revival of work combining individual and evolutionary learning. Despite the breadth of work in this area, the exact trade-offs between these two forms of learning remain unclear. In this work, we systematically examine the effect of task difficulty, the individual learning approach, and the form of inheritance on the performance of the population across different combinations of learning and evolution. We analyze in depth the conditions in which hybrid strategies that combine lifetime and evolutionary learning outperform either lifetime or evolutionary learning in isolation. We also discuss the importance of these results in both a biological and algorithmic context.
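The combinations the abstract examines can be sketched as an outer evolutionary loop around an inner hill-climbing lifetime, with the form of inheritance as a switch; the toy task and all settings below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(2)

def task_fitness(w):
    return -np.sum((w - 1.0) ** 2)             # toy task; real tasks vary in difficulty

def lifetime_learn(w, steps=20, step_size=0.05):
    """Individual learning: simple hill climbing during the lifetime."""
    w = w.copy()
    for _ in range(steps):
        trial = w + rng.normal(0, step_size, w.shape)
        if task_fitness(trial) > task_fitness(w):
            w = trial
    return w

def evolve(lamarckian, generations=30):
    pop = rng.normal(0, 1, size=(40, 5))
    for _ in range(generations):
        learned = np.array([lifetime_learn(w) for w in pop])
        scores = np.array([task_fitness(w) for w in learned])
        idx = np.argsort(scores)[-8:]
        # Form of inheritance: pass on learned weights (Lamarckian) or the
        # innate, unlearned genotype (Darwinian / Baldwinian selection).
        parents = learned[idx] if lamarckian else pop[idx]
        pop = np.repeat(parents, 5, axis=0) + rng.normal(0, 0.1, (40, 5))
    return max(task_fitness(lifetime_learn(w)) for w in pop)

print("Lamarckian:", evolve(True), " Darwinian:", evolve(False))
```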