

Title: Learning to Assimilate in Chaotic Dynamical Systems
The accuracy of simulation-based forecasting in chaotic systems is heavily dependent on high-quality estimates of the system state at the beginning of the forecast. Data assimilation methods are used to infer these initial conditions by systematically combining noisy, incomplete observations with numerical models of system dynamics to produce highly effective estimation schemes. We introduce a self-supervised framework, which we call amortized assimilation, for learning to assimilate in dynamical systems. Amortized assimilation combines deep learning-based denoising with differentiable simulation, using independent neural networks to assimilate specific observation types while connecting the gradient flow between these sub-tasks with differentiable simulation and shared recurrent memory. This hybrid architecture admits a self-supervised training objective which is minimized by an unbiased estimator of the true system state even in the presence of only noisy training data. Numerical experiments across several chaotic benchmark systems highlight the improved effectiveness of our approach compared to widely used data assimilation methods.
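The abstract describes the architecture only at a high level. As a concrete illustration, the following is a minimal, hypothetical PyTorch sketch of that general pattern: independent networks assimilating different observation types, a shared recurrent memory connecting them, a differentiable forecast step, and a self-supervised loss computed against held-out noisy observations. The toy dynamics, layer sizes, and class names are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch of an amortized-assimilation-style filter (illustrative only).
import torch
import torch.nn as nn

def simulate(x, dt=0.01):
    """Differentiable toy dynamics, standing in for a real numerical model."""
    return x + dt * torch.tanh(x)

class ObservationAssimilator(nn.Module):
    """Denoises one observation type, conditioned on the shared memory."""
    def __init__(self, obs_dim, state_dim, hidden=64):
        super().__init__()
        self.cell = nn.GRUCell(obs_dim + state_dim, hidden)
        self.head = nn.Linear(hidden, state_dim)

    def forward(self, obs, state, h):
        h = self.cell(torch.cat([obs, state], dim=-1), h)
        return state + self.head(h), h        # additive correction to the state

class AmortizedAssimilator(nn.Module):
    """One network per observation type, coupled through shared recurrent memory."""
    def __init__(self, obs_dims, state_dim, hidden=64):
        super().__init__()
        self.blocks = nn.ModuleList(
            [ObservationAssimilator(d, state_dim, hidden) for d in obs_dims])
        self.hidden = hidden

    def forward(self, obs_seq, x0):
        # obs_seq: list over time steps; each entry is a list of per-type observations.
        x, h = x0, x0.new_zeros(x0.shape[0], self.hidden)
        analyses = []
        for obs_t in obs_seq:
            x = simulate(x)                               # differentiable forecast step
            for block, obs in zip(self.blocks, obs_t):
                x, h = block(obs, x, h)                   # analysis step per observation type
            analyses.append(x)
        return torch.stack(analyses)

def self_supervised_loss(model, obs_seq, future_noisy_obs, x0, obs_op):
    """Match a forecast against held-out *noisy* observations; no clean states needed."""
    analyses = model(obs_seq, x0)
    pred = obs_op(simulate(analyses[-1]))
    return ((pred - future_noisy_obs) ** 2).mean()

# Tiny usage example: two observation types on a 3-dimensional state.
model = AmortizedAssimilator(obs_dims=[3, 2], state_dim=3)
obs_seq = [[torch.randn(8, 3), torch.randn(8, 2)] for _ in range(5)]
loss = self_supervised_loss(model, obs_seq, torch.randn(8, 3),
                            x0=torch.zeros(8, 3), obs_op=lambda x: x)
loss.backward()
```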
Award ID(s):
1835825
NSF-PAR ID:
10443541
Author(s) / Creator(s):
Editor(s):
Ranzato, M.; Beygelzimer, A.; Dauphin, Y.; Liang, P.S.; Vaughan, J. Wortman
Date Published:
Journal Name:
Advances in Neural Information Processing Systems
Volume:
34
ISSN:
1049-5258
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from noisy and partially observed data. We compare pure data-driven learning with hybrid models which incorporate imperfect domain knowledge, referring to the discrepancy between an assumed truth model and the imperfect mechanistic model as model error. Our formulation is agnostic to the chosen machine learning model, is presented in both continuous- and discrete-time settings, and is compatible both with model errors that exhibit substantial memory and with errors that are memoryless. First, we study memoryless model error that is linear with respect to its parametric dependence from a learning theory perspective, defining excess risk and generalization error. For ergodic continuous-time systems, we prove that both excess risk and generalization error are bounded above by terms that diminish with the square root of T, the time interval over which training data are specified. Second, we study scenarios that benefit from modeling with memory, proving universal approximation theorems for two classes of continuous-time recurrent neural networks (RNNs): both can learn memory-dependent model error, assuming that it is governed by a finite-dimensional hidden variable and that, together, the observed and hidden variables form a continuous-time Markovian system. In addition, we connect one class of RNNs to reservoir computing, thereby relating learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features. Numerical results are presented for the Lorenz ’63 and Lorenz ’96 multiscale systems to compare purely data-driven and hybrid approaches, finding hybrid methods less data-hungry and more parametrically efficient. We also find that, while a continuous-time framing allows for robustness to irregular sampling and desirable domain interpretability, a discrete-time framing can provide similar or better predictive performance, especially when data are undersampled and the vector field defining the true dynamics cannot be identified. Finally, we demonstrate numerically how data assimilation can be leveraged to learn hidden dynamics from noisy, partially observed data, and illustrate challenges in representing memory by this approach and in the training of such models.
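    As a rough illustration of the hybrid modelling idea summarized above, the sketch below augments an imperfect mechanistic vector field with a small learned correction and fits it to one-step snapshot pairs. The use of Lorenz ’63 with a perturbed parameter, the network size, and the training details are assumptions for illustration, not the paper's experiments.

```python
# Hybrid "mechanistic + learned correction" model: dx/dt ~ f_imperfect(x) + g_theta(x).
import torch
import torch.nn as nn

def f_true(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz '63 vector field (plays the role of the unknown true dynamics)."""
    dx = sigma * (x[..., 1] - x[..., 0])
    dy = x[..., 0] * (rho - x[..., 2]) - x[..., 1]
    dz = x[..., 0] * x[..., 1] - beta * x[..., 2]
    return torch.stack([dx, dy, dz], dim=-1)

def f_imperfect(x):
    return f_true(x, rho=24.0)                 # wrong parameter = model error

correction = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 3))

def hybrid_step(x, dt=0.01):
    """Euler step of the hybrid model: imperfect physics plus learned correction."""
    return x + dt * (f_imperfect(x) + correction(x))

# Train the correction on one-step snapshot pairs generated by the true system.
x = torch.randn(256, 3) * 5.0
x_next = x + 0.01 * f_true(x)
opt = torch.optim.Adam(correction.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = ((hybrid_step(x) - x_next) ** 2).mean()
    loss.backward()
    opt.step()
```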
  2. Models of many engineering and natural systems are imperfect. The discrepancy between the mathematical representation of a true physical system and its imperfect model is called the model error. These model errors can lead to substantial differences between the numerical solutions of the model and the state of the system, particularly in systems involving nonlinear, multi-scale phenomena. Thus, there is increasing interest in reducing model errors, particularly by leveraging the rapidly growing observational data to understand their physics and sources. Here, we introduce a framework named MEDIDA: Model Error Discovery with Interpretability and Data Assimilation. MEDIDA requires only a working numerical solver of the model and a small number of noise-free or noisy sporadic observations of the system. In MEDIDA, the model error is first estimated from differences between the observed states and model-predicted states (the latter obtained from a number of one-time-step numerical integrations from the previous observed states). If observations are noisy, a data assimilation technique, such as the ensemble Kalman filter, is employed to provide the analysis state of the system, which is then used to estimate the model error. Finally, an equation-discovery technique, here the relevance vector machine, a sparsity-promoting Bayesian method, is used to identify an interpretable, parsimonious, closed-form representation of the model error. Using the chaotic Kuramoto–Sivashinsky system as the test case, we demonstrate the excellent performance of MEDIDA in discovering different types of structural and parametric model errors, representing different types of missing physics, using noise-free and noisy observations.
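    A schematic version of this workflow on a toy one-dimensional model is sketched below: the model error is estimated from one-step mismatches between observations and the imperfect solver, and a sparsity-promoting regression then recovers a closed-form expression for it. The toy dynamics, the noise level, and the use of sequentially thresholded least squares in place of the relevance vector machine are illustrative assumptions.

```python
# MEDIDA-style model-error discovery on a toy scalar system (illustrative only).
import numpy as np

dt = 0.01

def solver_step(u):                       # imperfect model: misses the cubic term
    return u + dt * (-u)

def true_step(u):                         # "reality" generating the observations
    return u + dt * (-u - 0.5 * u**3)

rng = np.random.default_rng(0)
u_prev = rng.uniform(-2, 2, size=200)                             # sporadic observed states
u_obs = true_step(u_prev) + 1e-3 * rng.normal(size=u_prev.size)   # noisy next observations

# Step 1: estimate model-error samples from observation minus one-step model prediction.
err = (u_obs - solver_step(u_prev)) / dt

# Step 2: sparse regression of err onto a candidate library [1, u, u^2, u^3].
library = np.stack([np.ones_like(u_prev), u_prev, u_prev**2, u_prev**3], axis=1)
coef, *_ = np.linalg.lstsq(library, err, rcond=None)
for _ in range(10):                       # threshold small terms, refit the rest
    small = np.abs(coef) < 0.05
    coef[small] = 0.0
    keep = ~small
    coef[keep], *_ = np.linalg.lstsq(library[:, keep], err, rcond=None)

print("discovered model-error coefficients:", coef)   # expected to be close to [0, 0, 0, -0.5]
```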

     
  3. High-throughput phenotyping enables the efficient collection of plant trait data at scale. One example involves using imaging systems over key phases of a crop growing season. Although the resulting images provide rich data for statistical analyses of plant phenotypes, image processing for trait extraction is required as a prerequisite. Current methods for trait extraction are mainly based on supervised learning with human-labeled data or semi-supervised learning with a mixture of human-labeled and unlabeled data. Unfortunately, preparing a sufficiently large training dataset is both time- and labor-intensive. We describe a self-supervised pipeline (KAT4IA) that uses K-means clustering on greenhouse images to construct training data for extracting and analyzing plant traits from an image-based field phenotyping system. The KAT4IA pipeline includes these main steps: self-supervised training set construction, plant segmentation from images of field-grown plants, automatic separation of target plants, calculation of plant traits, and functional curve fitting of the extracted traits. To deal with the challenge of separating target plants from noisy backgrounds in field images, we describe a novel approach using row-cuts and column-cuts on images segmented by transform-domain neural network learning, which utilizes plant pixels identified from greenhouse images to train a segmentation model for field images. This approach is efficient and does not require human intervention. Our results show that KAT4IA is able to accurately extract plant pixels and estimate plant heights.
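    The self-supervised labelling step lends itself to a short sketch: K-means clustering on pixel colors separates plant from background and yields pseudo-labels that could then be used to train a field-image segmentation model. The synthetic image, the RGB features, and the excess-green rule for choosing the plant cluster are assumptions for illustration, not the KAT4IA code.

```python
# Self-supervised pseudo-labelling of plant pixels via K-means (illustrative only).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
h, w = 64, 64
image = rng.uniform(0.2, 0.5, size=(h, w, 3))      # stand-in greenhouse image (soil-like)
image[20:44, 20:44] = [0.2, 0.7, 0.25]             # a green "plant" patch

pixels = image.reshape(-1, 3)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pixels)

# Pick the "plant" cluster by excess green (2G - R - B), a common vegetation index.
excess_green = 2 * pixels[:, 1] - pixels[:, 0] - pixels[:, 2]
plant_cluster = max((0, 1), key=lambda c: excess_green[labels == c].mean())
mask = (labels == plant_cluster).reshape(h, w)     # pseudo-label mask for training a segmenter

print("plant pixel fraction:", mask.mean())
```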
  4. Training self-driving systems to be robust to the long tail of driving scenarios is a critical problem. Model-based approaches leverage simulation to emulate a wide range of scenarios without putting users at risk in the real world. One promising path to faithful simulation is to train a forward model of the world to predict the future states of both the environment and the ego-vehicle given past states and a sequence of actions. In this paper, we argue that it is beneficial to model the state of the ego-vehicle, which often has simple, predictable, and deterministic behavior, separately from the rest of the environment, which is much more complex and highly multimodal. We propose to model the ego-vehicle using a simple and differentiable kinematic model, while training a stochastic convolutional forward model on raster representations of the state to predict the behavior of the rest of the environment. We explore several configurations of such decoupled models, and evaluate their performance both with Model Predictive Control (MPC) and direct policy learning. We test our methods on the task of highway driving and demonstrate lower crash rates and better stability. The code is available at https://github.com/vladisai/pytorch-PPUU/tree/ICLR2022.
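    The proposed decoupling can be sketched concretely: the ego-vehicle follows a simple differentiable kinematic model while the rest of the scene would be predicted by a learned stochastic forward model, stubbed out below. The kinematic bicycle equations, the wheelbase value, and the placeholder environment model are assumptions for illustration, not the released code.

```python
# Decoupled world model: differentiable kinematic ego-vehicle + learned environment (stub).
import torch

def ego_kinematic_step(state, action, dt=0.1, wheelbase=2.7):
    """state = (x, y, heading, speed); action = (acceleration, steering angle)."""
    x, y, theta, v = state.unbind(-1)
    accel, steer = action.unbind(-1)
    x = x + v * torch.cos(theta) * dt
    y = y + v * torch.sin(theta) * dt
    theta = theta + v / wheelbase * torch.tan(steer) * dt
    v = v + accel * dt
    return torch.stack([x, y, theta, v], dim=-1)

def environment_forward(raster, ego_state):
    """Placeholder for a learned stochastic convolutional forward model."""
    return raster

state = torch.tensor([[0.0, 0.0, 0.0, 20.0]], requires_grad=True)
raster = torch.zeros(1, 3, 64, 64)                 # raster representation of the scene
actions = torch.zeros(1, 10, 2)                    # 10 steps of (acceleration, steering)

s = state
for t in range(actions.shape[1]):
    s = ego_kinematic_step(s, actions[:, t])       # ego evolves under known kinematics
    raster = environment_forward(raster, s)        # environment predicted by learned model

cost = (s[:, 3] - 15.0).pow(2).mean()              # e.g. an MPC-style speed cost
cost.backward()                                    # gradients flow through the ego model
```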
  5. Summary

    A new approach has been developed for combining and enhancing the results from an existing computational fluid dynamics model with experimental data using the weighted least-squares finite element method (WLSFEM). Development of the approach was motivated by the existence of both limited experimental blood velocity data in the left ventricle and inexact numerical models of the same flow. Limitations of the experimental data include measurement noise and the availability of data only along a two-dimensional plane. Most numerical modeling approaches do not provide the flexibility to assimilate noisy experimental data. We previously developed an approach that could assimilate experimental data into the process of numerically solving the Navier–Stokes equations, but the approach was limited because it required the use of specific finite element methods for solving all model equations and did not support alternative numerical approximation methods. The new approach presented here allows virtually any numerical method to be used for approximately solving the Navier–Stokes equations, and the WLSFEM is then used to combine the experimental data with the numerical solution of the model equations in a final step. The approach dynamically adjusts the influence of the experimental data on the numerical solution so that more accurate data are more closely matched by the final solution and less accurate data are not closely matched. The new approach is demonstrated on different test problems and provides significantly reduced computational costs compared with many previous methods for data assimilation. Copyright © 2016 John Wiley & Sons, Ltd.
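    The core weighted least-squares idea can be illustrated outside the finite element setting: stack a model-consistency constraint and a per-datum weighted data constraint into a single least-squares problem, so that more accurate measurements pull more strongly on the final field. The one-dimensional setup, the weights, and the synthetic data below are assumptions for illustration, not the paper's WLSFEM formulation.

```python
# Weighted least-squares blending of a model solution with noisy sparse data (illustrative).
import numpy as np

n = 50
x = np.linspace(0.0, 1.0, n)
u_true = np.sin(np.pi * x)
u_model = 0.9 * u_true                              # imperfect numerical solution

rng = np.random.default_rng(1)
obs_idx = rng.choice(n, size=10, replace=False)     # sparse measurement locations
noise_std = np.where(obs_idx < n // 2, 0.01, 0.1)   # half the data is much more accurate
d = u_true[obs_idx] + noise_std * rng.normal(size=obs_idx.size)

# Solve  min_u ||u - u_model||^2 + ||W (H u - d)||^2,  with W = diag(1 / noise_std).
H = np.zeros((obs_idx.size, n))
H[np.arange(obs_idx.size), obs_idx] = 1.0
W = np.diag(1.0 / noise_std)                        # accurate data receives larger weight
A = np.vstack([np.eye(n), W @ H])
b = np.concatenate([u_model, W @ d])
u_combined = np.linalg.lstsq(A, b, rcond=None)[0]

print("model RMSE   :", np.sqrt(np.mean((u_model - u_true) ** 2)))
print("combined RMSE:", np.sqrt(np.mean((u_combined - u_true) ** 2)))
```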

     