Title: Discovery of interpretable structural model errors by combining Bayesian sparse regression and data assimilation: A chaotic Kuramoto–Sivashinsky test case
Models of many engineering and natural systems are imperfect. The discrepancy between the mathematical representations of a true physical system and its imperfect model is called the model error. These model errors can lead to substantial differences between the numerical solutions of the model and the state of the system, particularly in those involving nonlinear, multi-scale phenomena. Thus, there is increasing interest in reducing model errors, particularly by leveraging the rapidly growing observational data to understand their physics and sources. Here, we introduce a framework named MEDIDA: Model Error Discovery with Interpretability and Data Assimilation. MEDIDA only requires a working numerical solver of the model and a small number of noise-free or noisy sporadic observations of the system. In MEDIDA, first, the model error is estimated from differences between the observed states and model-predicted states (the latter are obtained from a number of one-time-step numerical integrations from the previous observed states). If observations are noisy, a data assimilation technique, such as the ensemble Kalman filter, is employed to provide the analysis state of the system, which is then used to estimate the model error. Finally, an equation-discovery technique, here the relevance vector machine, a sparsity-promoting Bayesian method, is used to identify an interpretable, parsimonious, and closed-form representation of the model error. Using the chaotic Kuramoto–Sivashinsky system as the test case, we demonstrate the excellent performance of MEDIDA in discovering different types of structural/parametric model errors, representing different types of missing physics, using noise-free and noisy observations.
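The noise-free branch of the procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. It estimates the model error from one-time-step forecasts and then fits a sparse combination of candidate terms. Several choices here are assumptions made only for the example: the "true" system is a Kuramoto–Sivashinsky equation on a 2π-periodic domain, the imperfect model has the wrong coefficient (0.5 instead of 1) on the u_xx term, the observed states are random smooth fields rather than samples from a chaotic trajectory, and sequentially thresholded least squares stands in for the relevance vector machine. The data assimilation step for noisy observations is not shown.

```python
import numpy as np

# Illustrative setup (all values are assumptions for this sketch).
N, L, dt = 32, 2 * np.pi, 1e-5
x = np.linspace(0.0, L, N, endpoint=False)
wavenum = 2 * np.pi * np.fft.fftfreq(N, d=L / N)  # angular wavenumbers


def deriv(u, order):
    """Spectral derivative of a periodic field."""
    return np.real(np.fft.ifft((1j * wavenum) ** order * np.fft.fft(u)))


def ks_rhs(u, c2):
    """Right-hand side u_t = -u u_x - c2 u_xx - u_xxxx."""
    return -u * deriv(u, 1) - c2 * deriv(u, 2) - deriv(u, 4)


def rk4_step(u, c2):
    """One classical Runge-Kutta step of size dt."""
    k1 = ks_rhs(u, c2)
    k2 = ks_rhs(u + 0.5 * dt * k1, c2)
    k3 = ks_rhs(u + 0.5 * dt * k2, c2)
    k4 = ks_rhs(u + dt * k3, c2)
    return u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def stlsq(A, b, threshold=0.05, iters=10):
    """Sequentially thresholded least squares (swapped in for the RVM)."""
    coef = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        if small.all():
            break
        coef[~small] = np.linalg.lstsq(A[:, ~small], b, rcond=None)[0]
    return coef


rng = np.random.default_rng(0)
modes = np.arange(1, 5)
names = ["u", "u_x", "u_xx", "u_xxx", "u_xxxx", "u*u_x", "u^2"]
rows, targets = [], []
for _ in range(20):
    # Hypothetical "previous observed state": a random smooth periodic field.
    amp = rng.normal(size=modes.size)
    phase = rng.uniform(0.0, 2 * np.pi, size=modes.size)
    u0 = (amp[:, None] * np.cos(modes[:, None] * x + phase[:, None])).sum(axis=0)

    u_obs = rk4_step(u0, 1.0)  # next observation from the "true" system
    u_mod = rk4_step(u0, 0.5)  # one-step forecast of the imperfect model
    targets.append((u_obs - u_mod) / dt)  # model-error estimate

    d1, d2, d3, d4 = (deriv(u0, n) for n in range(1, 5))
    rows.append(np.column_stack([u0, d1, d2, d3, d4, u0 * d1, u0 ** 2]))

A = np.vstack(rows)
b = np.concatenate(targets)
coef = stlsq(A, b)

for name, c in zip(names, coef):
    if c != 0.0:
        print(f"{c:+.4f} * {name}")
```

With these assumed settings, the regression should retain a single term, u_xx, with a coefficient close to -0.5, which is the difference between the true and model coefficients. The hard threshold of 0.05 is a tuning choice for this toy case; a Bayesian method such as the relevance vector machine would instead prune terms based on their posterior uncertainty.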
Award ID(s):
2005123
PAR ID:
10367968
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
American Institute of Physics
Date Published:
Journal Name:
Chaos: An Interdisciplinary Journal of Nonlinear Science
Volume:
32
Issue:
6
ISSN:
1054-1500
Page Range / eLocation ID:
Article No. 061105
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Weather forecasts made with imperfect models contain state‐dependent errors. Data assimilation (DA) partially corrects these errors with new information from observations. As such, the corrections, or “analysis increments,” produced by the DA process embed information about model errors. An attempt is made here to extract that information to improve numerical weather prediction. Neural networks (NNs) are trained to predict corrections to the systematic error in the National Oceanic and Atmospheric Administration's FV3‐GFS model based on a large set of analysis increments. A simple NN focusing on an atmospheric column significantly improves the estimated model error correction relative to a linear baseline. Leveraging large‐scale horizontal flow conditions using a convolutional NN, when compared to the simple column‐oriented NN, does not improve skill in correcting model error. The sensitivity of model error correction to forecast inputs is highly localized by vertical level and by meteorological variable, and the error characteristics vary across vertical levels. Once trained, the NNs are used to apply an online correction to the forecast during model integration. Improvements are evaluated both within a cycled DA system and across a collection of 10‐day forecasts. It is found that applying state‐dependent NN‐predicted corrections to the model forecast improves the overall quality of DA and improves the 10‐day forecast skill at all lead times. 
  2. The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from noisily and partially observed data. We compare pure data-driven learning with hybrid models which incorporate imperfect domain knowledge, referring to the discrepancy between an assumed truth model and the imperfect mechanistic model as model error. Our formulation is agnostic to the chosen machine learning model, is presented in both continuous- and discrete-time settings, and is compatible both with model errors that exhibit substantial memory and errors that are memoryless. First, we study memoryless linear (w.r.t. parametric-dependence) model error from a learning theory perspective, defining excess risk and generalization error. For ergodic continuous-time systems, we prove that both excess risk and generalization error are bounded above by terms that diminish with the square root of T, the time interval over which training data is specified. Secondly, we study scenarios that benefit from modeling with memory, proving universal approximation theorems for two classes of continuous-time recurrent neural networks (RNNs): both can learn memory-dependent model error, assuming that it is governed by a finite-dimensional hidden variable and that, together, the observed and hidden variables form a continuous-time Markovian system. In addition, we connect one class of RNNs to reservoir computing, thereby relating learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features. Numerical results are presented (Lorenz ’63, Lorenz ’96 Multiscale systems) to compare purely data-driven and hybrid approaches, finding hybrid methods less data-hungry and more parametrically efficient. 
We also find that, while a continuous-time framing allows for robustness to irregular sampling and desirable domain interpretability, a discrete-time framing can provide similar or better predictive performance, especially when data are undersampled and the vector field defining the true dynamics cannot be identified. Finally, we demonstrate numerically how data assimilation can be leveraged to learn hidden dynamics from noisy, partially observed data, and illustrate challenges in representing memory by this approach, and in the training of such models. 
  3. Ranzato, M.; Beygelzimer, A.; Dauphin, Y.; Liang, P.S.; Vaughan, J. Wortman (Ed.)
    The accuracy of simulation-based forecasting in chaotic systems is heavily dependent on high-quality estimates of the system state at the beginning of the forecast. Data assimilation methods are used to infer these initial conditions by systematically combining noisy, incomplete observations and numerical models of system dynamics to produce highly effective estimation schemes. We introduce a self-supervised framework, which we call amortized assimilation, for learning to assimilate in dynamical systems. Amortized assimilation combines deep learning-based denoising with differentiable simulation, using independent neural networks to assimilate specific observation types while connecting the gradient flow between these sub-tasks with differentiable simulation and shared recurrent memory. This hybrid architecture admits a self-supervised training objective which is minimized by an unbiased estimator of the true system state even in the presence of only noisy training data. Numerical experiments across several chaotic benchmark systems highlight the improved effectiveness of our approach compared to widely-used data assimilation methods. 
  4. Abstract Estimation of uncertainties (random error statistics) of radio occultation (RO) observations is important for their effective assimilation in numerical weather prediction (NWP) models. Average uncertainties can be estimated for large samples of RO observations and these statistics may be used for specifying the observation errors in NWP data assimilation. However, the uncertainties of individual RO observations vary, and so using average uncertainty estimates will overestimate the uncertainties of some observations and underestimate those of others, reducing their overall effectiveness in the assimilation. Several parameters associated with RO observations or their atmospheric environments have been proposed to estimate individual RO errors. These include the standard deviation of bending angle (BA) departures from either climatology in the upper stratosphere and lower mesosphere (STDV) or the sample mean between 40 and 60 km (STD4060), the local spectral width (LSW), and the magnitude of the horizontal gradient of refractivity (|∇HN|). In this paper we show how the uncertainties of two RO datasets, COSMIC-2 and Spire BA, as well as their combination, vary with these parameters. We find that the uncertainties are highly correlated with STDV and STD4060 in the stratosphere, and with LSW and |∇HN| in the lower troposphere. These results suggest a hybrid error model for individual BA observations that uses an average statistical model of RO errors modified by STDV or STD4060 above 30 km, and LSW or |∇HN| below 8 km. Significance Statement: These results contribute to the understanding of the sources of uncertainties in radio occultation observations. They could be used to improve the effectiveness of these observations in their assimilation into numerical weather prediction and reanalysis models by improving the estimation of their observational errors. 
  5. Quantum computing testbeds exhibit high-fidelity quantum control over small collections of qubits, enabling performance of precise, repeatable operations followed by measurements. Currently, these noisy intermediate-scale devices can support a sufficient number of sequential operations prior to decoherence such that near-term algorithms can be performed with proximate accuracy (like chemical accuracy for quantum chemistry problems). While the results of these algorithms are imperfect, these imperfections can help bootstrap quantum computer testbed development. Demonstrations of these algorithms over the past few years, coupled with the idea that imperfect algorithm performance can be caused by several dominant noise sources in the quantum processor, which can be measured and calibrated during algorithm execution or in post-processing, have led to the use of noise mitigation to improve typical computational results. Conversely, benchmark algorithms coupled with noise mitigation can help diagnose the nature of the noise, whether systematic or purely random. Here, we outline the use of coherent noise mitigation techniques as a characterization tool in trapped-ion testbeds. We perform model-fitting of the noisy data to determine the noise source based on realistic, physics-focused noise models and demonstrate that systematic noise amplification coupled with error mitigation schemes provides useful data for noise model deduction. Further, in order to connect lower-level noise model details with application-specific performance of near-term algorithms, we experimentally construct the loss landscape of a variational algorithm under various injected noise sources coupled with error mitigation techniques. This type of connection enables application-aware hardware codesign, in which the most important noise sources in specific applications, like quantum chemistry, become foci of improvement in subsequent hardware generations. 