

Title: Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter
Experimental data are often costly to obtain, which makes it difficult to calibrate complex models. For many models an experimental design that produces the best calibration given a limited experimental budget is not obvious. This paper introduces a deep reinforcement learning (RL) algorithm for design of experiments that maximizes the information gain measured by Kullback–Leibler divergence obtained via the Kalman filter (KF). This combination enables experimental design for rapid online experiments where manual trial-and-error is not feasible in the high-dimensional parametric design space. We formulate possible configurations of experiments as a decision tree and a Markov decision process, where a finite choice of actions is available at each incremental step. Once an action is taken, a variety of measurements are used to update the state of the experiment. This new data leads to a Bayesian update of the parameters by the KF, which is used to enhance the state representation. In contrast to the Nash–Sutcliffe efficiency index, which requires additional sampling to test hypotheses for forward predictions, the KF can lower the cost of experiments by directly estimating the values of new data acquired through additional actions. In this work our applications focus on mechanical testing of materials. Numerical experiments with complex, history-dependent models are used to verify the implementation and benchmark the performance of the RL-designed experiments.
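The coupling described above, in which KF-based Bayesian updates supply a Kullback–Leibler information-gain reward for the experiment-design policy, can be illustrated with a minimal sketch. The snippet below assumes a linear-Gaussian observation model and a finite action set; the names `kalman_update`, `kl_gaussian`, and `best_action`, as well as the greedy one-step look-ahead, are illustrative stand-ins and not the authors' implementation, which trains a deep RL policy over the full decision tree.

```python
import numpy as np

def kalman_update(m, P, H, R, y):
    """One Bayesian update of a Gaussian parameter estimate (m, P)
    given a linearized observation y ~ N(H @ theta, R)."""
    S = H @ P @ H.T + R                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    m_new = m + K @ (y - H @ m)              # posterior mean
    P_new = (np.eye(len(m)) - K @ H) @ P     # posterior covariance
    return m_new, P_new

def kl_gaussian(m1, P1, m0, P0):
    """KL(N(m1, P1) || N(m0, P0)); used here as the information-gain
    reward of a candidate experimental step."""
    d = len(m0)
    P0_inv = np.linalg.inv(P0)
    diff = m0 - m1
    return 0.5 * (np.trace(P0_inv @ P1) + diff @ P0_inv @ diff - d
                  + np.log(np.linalg.det(P0) / np.linalg.det(P1)))

def best_action(m, P, H_per_action, R, simulate_measurement):
    """Greedy one-step look-ahead over a finite action set; each action
    maps to a hypothetical measurement operator (illustrative only)."""
    rewards = {}
    for a, H in H_per_action.items():
        y = simulate_measurement(a)               # measurement for action a
        m1, P1 = kalman_update(m, P, H, R, y)
        rewards[a] = kl_gaussian(m1, P1, m, P)    # information gain
    return max(rewards, key=rewards.get)

# Toy usage: two candidate measurements of a two-parameter model.
rng = np.random.default_rng(0)
m0, P0 = np.zeros(2), np.eye(2)
R = 0.01 * np.eye(1)
H_per_action = {"measure_p1": np.array([[1.0, 0.0]]),
                "measure_p2": np.array([[0.0, 1.0]])}
true_theta = np.array([0.3, -0.7])
sim = lambda a: H_per_action[a] @ true_theta + rng.normal(0, 0.1, 1)
print(best_action(m0, P0, H_per_action, R, sim))
```

In the paper the reward of this kind is consumed by a deep RL agent exploring the experiment decision tree; the greedy loop above only shows how the KF-derived KL reward plugs into an action-selection step.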
Award ID(s):
1846875
NSF-PAR ID:
10487109
Author(s) / Creator(s):
Publisher / Repository:
Computational Mechanics
Date Published:
Journal Name:
Computational Mechanics
Volume:
72
Issue:
1
ISSN:
0178-7675
Page Range / eLocation ID:
95 to 124
Subject(s) / Keyword(s):
Experimental design · Deep reinforcement learning · Enhanced Kalman filter · Elastoplasticity
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    In anagram games, players are provided with letters for forming as many words as possible over a specified time duration. Anagram games have been used in controlled experiments to study problems such as collective identity, effects of goal setting, internal-external attributions, test anxiety, and others. The majority of work on anagram games involves individual players. Recently, work has expanded to group anagram games where players cooperate by sharing letters. In this work, we analyze experimental data from online social networked experiments of group anagram games. We develop mechanistic and data-driven models of human decision-making to predict detailed game player actions (e.g., what word to form next). With these results, we develop a composite agent-based modeling and simulation platform that incorporates the models from data analysis. We compare model predictions against experimental data, which enables us to provide explanations of human decision-making and behavior. Finally, we provide illustrative case studies using agent-based simulations to demonstrate the efficacy of models to provide insights that are beyond those from experiments alone.
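A toy agent-based sketch can make the setup described in this entry concrete: players hold letters, may share one with a neighbour, and form words from the letters available to them. Everything below, including the stand-in decision rule for choosing the next word, the tiny word list, and the `Player` class, is hypothetical and only indicates the shape of such a simulation, not the platform or the fitted decision models of the paper.

```python
import random

WORDLIST = {"cat", "act", "tack", "rat", "art", "tar", "car"}  # toy dictionary

class Player:
    def __init__(self, letters, neighbours):
        self.letters = list(letters)
        self.neighbours = neighbours      # indices of connected players
        self.received = []                # letters shared by neighbours
        self.words = []

    def available(self):
        return self.letters + self.received

    def step(self, players):
        # Placeholder decision rule: form the longest word we can spell,
        # otherwise share a random letter with a random neighbour.
        pool = self.available()
        candidates = [w for w in WORDLIST
                      if all(pool.count(c) >= w.count(c) for c in set(w))
                      and w not in self.words]
        if candidates:
            self.words.append(max(candidates, key=len))
        elif self.neighbours:
            target = players[random.choice(self.neighbours)]
            target.received.append(random.choice(self.letters))

def simulate(rounds=20):
    players = [Player("catr", [1]), Player("kct", [0])]
    for _ in range(rounds):
        for p in players:
            p.step(players)
    return [p.words for p in players]

print(simulate())
```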
  2.
    Interactive reinforcement learning (IRL) agents use human feedback or instruction to help them learn in complex environments. Often, this feedback comes in the form of a discrete signal that is either positive or negative. While informative, such a signal can be difficult to generalize on its own. In this work, we explore how natural language advice can provide a richer feedback signal to a reinforcement learning agent by extending policy shaping, a well-known IRL technique. Policy shaping usually employs a human feedback policy to help an agent learn more about how to achieve its goal. In our case, we replace this human feedback policy with a policy generated from natural language advice. We aim to examine whether the generated natural language reasoning helps a deep RL agent choose its actions successfully in a given environment. Our model therefore comprises three networks: an experience-driven network, an advice generator, and an advice-driven network. While the experience-driven RL agent chooses its actions based on the environmental reward, the advice-driven network uses the feedback generated by the advice generator for each new state to select actions that assist the RL agent through better policy shaping.
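As a rough illustration of the combination step, the sketch below merges an experience-driven action distribution with an advice-driven one in the multiplicative style commonly used for policy shaping. The `advice_probs` input stands in for the output of an advice-driven network fed with generated natural-language advice; the function and its arguments are assumptions for illustration, not the authors' architecture.

```python
import numpy as np

def shaped_policy(q_values, advice_probs, temperature=1.0):
    """Combine an experience-driven policy (softmax over Q-values) with an
    advice-driven action distribution: multiply the two distributions and
    renormalise, in the spirit of policy shaping."""
    exp_policy = np.exp(np.asarray(q_values) / temperature)
    exp_policy /= exp_policy.sum()
    combined = exp_policy * np.asarray(advice_probs)
    return combined / combined.sum()

# Example: four actions; the advice strongly favours action 2.
q = np.array([1.0, 0.5, 0.8, 0.2])
advice = np.array([0.1, 0.1, 0.7, 0.1])
print(shaped_policy(q, advice))
```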
  3.
    We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks. 
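A minimal sketch of the sequence-modelling abstraction described in this entry: trajectories are re-encoded as interleaved (return-to-go, state, action) tokens, and at test time the desired return seeds the sequence. The helper names below are illustrative, and the causally masked Transformer itself is omitted.

```python
import numpy as np

def returns_to_go(rewards):
    """Return-to-go at each timestep: the conditioning signal the
    Decision Transformer autoregresses on."""
    return np.cumsum(rewards[::-1])[::-1]

def build_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples into one token
    sequence, casting the trajectory as a sequence-modelling problem."""
    rtg = returns_to_go(np.asarray(rewards, dtype=float))
    sequence = []
    for g, s, a in zip(rtg, states, actions):
        sequence.extend([("rtg", g), ("state", s), ("action", a)])
    return sequence

# At evaluation time one would seed the sequence with the *desired* return
# and then sample actions autoregressively from the trained model.
print(build_sequence(states=[0, 1, 2], actions=[1, 0, 1], rewards=[0.0, 0.0, 1.0]))
```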
  4. Abstract

    The Kölliker–Fuse nucleus (KF), which is part of the parabrachial complex, participates in the generation of eupnoea under resting conditions and the control of active abdominal expiration when increased ventilation is required. Moreover, dysfunctions in KF neuronal activity are believed to play a role in the emergence of respiratory abnormalities seen in Rett syndrome (RTT), a progressive neurodevelopmental disorder associated with an irregular breathing pattern and frequent apnoeas. Relatively little is known, however, about the intrinsic dynamics of neurons within the KF and how their synaptic connections affect breathing pattern control and contribute to breathing irregularities. In this study, we use a reduced computational model to consider several dynamical regimes of KF activity paired with different input sources to determine which combinations are compatible with known experimental observations. We further build on these findings to identify possible interactions between the KF and other components of the respiratory neural circuitry. Specifically, we present two models that both simulate eupnoeic as well as RTT‐like breathing phenotypes. Using nullcline analysis, we identify the types of inhibitory inputs to the KF leading to RTT‐like respiratory patterns and suggest possible KF local circuit organizations. When the identified properties are present, the two models also exhibit quantal acceleration of late‐expiratory activity, a hallmark of active expiration featuring forced exhalation, with increasing inhibition to KF, as reported experimentally. Hence, these models instantiate plausible hypotheses about possible KF dynamics and forms of local network interactions, thus providing a general framework as well as specific predictions for future experimental testing.

    Key points

    The Kölliker–Fuse nucleus (KF), a part of the parabrachial complex, is involved in regulating normal breathing and controlling active abdominal expiration during increased ventilation.

    Dysfunction in KF neuronal activity is thought to contribute to respiratory abnormalities seen in Rett syndrome (RTT). This study utilizes computational modelling to explore different dynamical regimes of KF activity and their compatibility with experimental observations.

    By analysing different model configurations, the study identifies inhibitory inputs to the KF that lead to RTT‐like respiratory patterns and proposes potential KF local circuit organizations.

    Two models are presented that simulate both normal breathing and RTT‐like breathing patterns.

    These models provide testable hypotheses and specific predictions for future experimental investigations, offering a general framework for understanding KF dynamics and potential network interactions.

     
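To indicate what a nullcline analysis of a reduced model can look like, the sketch below uses a generic two-variable, persistent-sodium-type neuron with an explicit inhibitory conductance and reports how the voltage nullcline shifts as that inhibition grows. The equations, parameter values, and variable names are placeholders chosen for illustration, not the specific model analysed in the paper.

```python
import numpy as np

# Illustrative parameters for a two-variable (V, h) reduced neuron model.
g_nap, g_leak = 2.8, 2.4                    # conductances (nS), placeholders
E_na, E_leak, E_inh = 50.0, -65.0, -75.0    # reversal potentials (mV)

def m_inf(v):
    """Steady-state activation of the persistent sodium current."""
    return 1.0 / (1.0 + np.exp(-(v + 40.0) / 6.0))

def h_inf(v):
    """Steady-state inactivation, which is also the h-nullcline."""
    return 1.0 / (1.0 + np.exp((v + 48.0) / 6.0))

def v_nullcline(v, g_inh):
    """h value at which dV/dt = 0 for a given V and inhibitory conductance,
    from 0 = -g_nap*m_inf(V)*h*(V-E_na) - g_leak*(V-E_leak) - g_inh*(V-E_inh)."""
    return -(g_leak * (v - E_leak) + g_inh * (v - E_inh)) / (
        g_nap * m_inf(v) * (v - E_na))

# How the V-nullcline moves as inhibition to the cell increases.
vs = np.linspace(-70.0, -20.0, 6)
for g_inh in (0.0, 0.2, 0.4):
    print(g_inh, np.round(v_nullcline(vs, g_inh), 3))
```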
  5. The action language m∗ employs the notion of update models in defining transitions between states. Given an action occurrence and a state, the update model of the action occurrence is automatically constructed from the given state and the observability of agents. A main criticism of this approach is that it cannot deal with situations when agents’ have incorrect beliefs about the observability of other agents. The present paper addresses this shortcoming by defining a new semantics for m∗ . The new semantics addresses the aforementioned problem of m∗ while maintaining the simplicity of its semantics; the new definitions continue to employ simple update models, with at most three events for all types of actions, which can be constructed given the action specification and independently from the state in which the action occurs. 
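As an informal illustration of "simple update models with at most three events", the sketch below encodes a sensing-style update model whose three events cover the two sensing outcomes plus a null event, with each agent's accessibility relation derived from its observability class (full, partial, or oblivious). The data structures and the `sensing_update_model` helper are assumptions made for illustration and do not reproduce the paper's formal semantics.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    name: str
    precondition: str              # formula the event requires, kept as a string

@dataclass
class UpdateModel:
    events: list                   # at most three Event objects
    designated: str                # the event that actually occurs
    accessibility: dict = field(default_factory=dict)  # agent -> {event -> set of events}

def sensing_update_model(fluent, full_observers, partial_observers, oblivious):
    """Build a three-event update model for sensing `fluent`, independently
    of the state in which the action occurs (illustrative encoding)."""
    occurs = Event("sense_true", fluent)
    alt    = Event("sense_false", f"not {fluent}")
    noop   = Event("noop", "true")
    acc = {}
    for ag in full_observers:      # see which outcome occurred
        acc[ag] = {"sense_true": {"sense_true"}, "sense_false": {"sense_false"}, "noop": {"noop"}}
    for ag in partial_observers:   # know sensing happened, but not the outcome
        acc[ag] = {"sense_true": {"sense_true", "sense_false"},
                   "sense_false": {"sense_true", "sense_false"}, "noop": {"noop"}}
    for ag in oblivious:           # believe nothing happened
        acc[ag] = {"sense_true": {"noop"}, "sense_false": {"noop"}, "noop": {"noop"}}
    return UpdateModel([occurs, alt, noop], designated="sense_true", accessibility=acc)

u = sensing_update_model("door_open", full_observers={"A"}, partial_observers={"B"}, oblivious={"C"})
print(u.designated, [e.name for e in u.events])
```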