- Award ID(s): 2108856
- PAR ID: 10329149
- Date Published:
- Journal Name: Chaos
- Volume: 32
- ISSN: 1089-7682
- Page Range / eLocation ID: 053122
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Discovering the underlying dynamics of complex systems from data is an important practical topic. Constrained optimization algorithms are widely utilized and lead to many successes. Yet, such purely data-driven methods may bring about incorrect physics in the presence of random noise and cannot easily handle the situation with incomplete data. In this paper, a new iterative learning algorithm for complex turbulent systems with partial observations is developed that alternates between identifying model structures, recovering unobserved variables, and estimating parameters. First, a causality-based learning approach is utilized for the sparse identification of model structures, which takes into account certain physics knowledge that is pre-learned from data. It has unique advantages in coping with indirect coupling between features and is robust to stochastic noise. A practical algorithm is designed to facilitate causal inference for high-dimensional systems. Next, a systematic nonlinear stochastic parameterization is built to characterize the time evolution of the unobserved variables. Closed analytic formulae obtained via efficient nonlinear data assimilation are exploited to sample the trajectories of the unobserved variables, which are then treated as synthetic observations to enable rapid parameter estimation. Furthermore, the localization of the state-variable dependence and the physics constraints are incorporated into the learning procedure, which mitigates the curse of dimensionality and prevents the finite-time blow-up issue. Numerical experiments show that the new algorithm identifies the model structure and provides suitable stochastic parameterizations for many complex nonlinear systems with chaotic dynamics, spatiotemporal multiscale structures, intermittency, and extreme events.
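The paper's algorithm alternates causality-based structure learning, nonlinear data assimilation of the hidden variables, and parameter estimation. As a hedged illustration of the structure-identification step only, the sketch below applies sequential thresholded least squares (a common stand-in for causality-based sparse selection, and with all variables observed rather than partially observed) to data simulated from the Lorenz 63 system; the candidate library, threshold, and step size are illustrative choices, not values from the paper.

```python
# Simplified stand-in for the structure-identification step only: sequential
# thresholded least squares (SINDy-style) on fully observed Lorenz 63 data.
# The paper's causality-based criterion and partial-observation machinery are not shown.
import numpy as np

def lorenz(s, sigma=10.0, rho=28.0, beta=8.0/3.0):
    x, y, z = s
    return np.array([sigma*(y - x), x*(rho - z) - y, x*y - beta*z])

# simulate the "truth" with RK4 (illustrative step size and length)
dt, nsteps = 0.002, 10000
traj = np.empty((nsteps, 3)); traj[0] = [1.0, 1.0, 1.0]
for k in range(nsteps - 1):
    s = traj[k]
    k1 = lorenz(s); k2 = lorenz(s + 0.5*dt*k1)
    k3 = lorenz(s + 0.5*dt*k2); k4 = lorenz(s + dt*k3)
    traj[k+1] = s + dt*(k1 + 2*k2 + 2*k3 + k4)/6.0

dstate = np.gradient(traj, dt, axis=0)               # finite-difference time derivatives
x, y, z = traj.T
library = np.column_stack([np.ones(nsteps), x, y, z, x*x, x*y, x*z, y*y, y*z, z*z])
names = ["1", "x", "y", "z", "xx", "xy", "xz", "yy", "yz", "zz"]

coefs = np.linalg.lstsq(library, dstate, rcond=None)[0]     # shape (10, 3)
for _ in range(10):                                  # sequentially threshold and refit
    small = np.abs(coefs) < 0.1
    coefs[small] = 0.0
    for j in range(3):
        keep = ~small[:, j]
        if keep.any():
            coefs[keep, j] = np.linalg.lstsq(library[:, keep], dstate[:, j], rcond=None)[0]

for j, eq in enumerate(["dx/dt", "dy/dt", "dz/dt"]):
    terms = [f"{coefs[i, j]:+.2f}*{names[i]}" for i in range(10) if coefs[i, j] != 0.0]
    print(eq, "=", " ".join(terms))   # should recover the Lorenz structure approximately
```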
-
We present a non‐Gaussian ensemble data assimilation method based on the maximum‐likelihood ensemble filter, which allows for any combination of Gaussian, lognormal, and reverse lognormal errors in both the background and the observations. The technique is fully nonlinear, does not require a tangent linear model, and uses a Hessian preconditioner to minimise the cost function efficiently in ensemble space. When the Gaussian assumption is relaxed, the results show significant improvements in the analysis skill within two atmospheric toy models, and the performance of data assimilation systems for (semi)bounded variables is expected to improve.
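This is not the maximum-likelihood ensemble filter itself, but a scalar toy (with hypothetical numbers) illustrating the ingredient the method generalizes: replacing a Gaussian observation-error term in the variational cost function with a lognormal one for a positive quantity, so that the error model acts multiplicatively and the analysis stays in the admissible range.

```python
# Scalar toy: analysis of a positive quantity with a Gaussian background term and
# either a Gaussian or a lognormal observation-error term.  Numbers are hypothetical;
# the actual method minimises such costs in ensemble space with a Hessian preconditioner.
import numpy as np

xb, sig_b = 0.5, 0.4          # background mean and spread (positive variable)
y_obs = 0.2                   # observed value
sig_o = 0.5                   # obs-error spread (in state units, or in log units below)

def cost_gaussian(x):
    return 0.5*((x - xb)/sig_b)**2 + 0.5*((x - y_obs)/sig_o)**2

def cost_lognormal(x):
    # x-dependent part of the negative log of a lognormal observation likelihood
    return 0.5*((x - xb)/sig_b)**2 + 0.5*((np.log(y_obs) - np.log(x))/sig_o)**2

grid = np.linspace(1e-3, 3.0, 30000)          # crude grid search instead of a real minimiser
xa_gauss = grid[np.argmin(cost_gaussian(grid))]
xa_logn = grid[np.argmin(cost_lognormal(grid))]
print(f"Gaussian-error analysis:  {xa_gauss:.3f}")
print(f"lognormal-error analysis: {xa_logn:.3f}")   # cannot fall at or below zero
```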
-
Abstract A hybrid data assimilation algorithm is developed for complex dynamical systems with partial observations. The method starts with applying a spectral decomposition to the entire spatiotemporal fields, followed by creating a machine learning model that builds a nonlinear map between the coefficients of observed and unobserved state variables for each spectral mode. A cheap low‐order nonlinear stochastic parameterized extended Kalman filter (SPEKF) model is employed as the forecast model in the ensemble Kalman filter to deal with each mode associated with the observed variables. The resulting ensemble members are then fed into the machine learning model to create an ensemble of the corresponding unobserved variables. In addition to the ensemble spread, the training residual in the machine learning‐induced nonlinear map is further incorporated into the state estimation, advancing the diagnostic quantification of the posterior uncertainty. The hybrid data assimilation algorithm is applied to a precipitating quasi‐geostrophic (PQG) model, which includes the effects of water vapor, clouds, and rainfall beyond the classical two‐level QG model. The complicated nonlinearities in the PQG equations prevent traditional methods from building simple and accurate reduced‐order forecast models. In contrast, the SPEKF forecast model is skillful in recovering the intermittent observed states, and the machine learning model effectively estimates the chaotic unobserved signals. Utilizing the calibrated SPEKF and machine learning models under a moderate cloud fraction, the resulting hybrid data assimilation remains reasonably accurate when applied to other geophysical scenarios with nearly clear skies or relatively heavy rainfall, implying the robustness of the algorithm for extrapolation.
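A numpy-only caricature of one ingredient of such a hybrid scheme: a regression map (standing in for the machine-learning model, fitted here with polynomial features) is trained between an observed and an unobserved quantity, each analysis ensemble member of the observed quantity is pushed through the map, and the training residual variance is folded into the posterior uncertainty of the unobserved quantity. The SPEKF forecast model, the spectral decomposition, and the PQG setting are not represented.

```python
# Caricature of the observed-to-unobserved map: fit a regression on training data,
# apply it to each analysis ensemble member of the observed variable, and add the
# training residual to the posterior spread of the unobserved variable.
import numpy as np

rng = np.random.default_rng(1)

# hypothetical training data: unobserved v depends nonlinearly on observed u
u_train = rng.uniform(-3.0, 3.0, 2000)
v_train = np.sin(u_train) + 0.1*rng.standard_normal(u_train.size)

def features(u):            # cubic polynomial features as a stand-in for a neural network
    return np.column_stack([np.ones_like(u), u, u**2, u**3])

w = np.linalg.lstsq(features(u_train), v_train, rcond=None)[0]
resid_var = np.var(v_train - features(u_train) @ w)    # training residual variance

# hypothetical analysis ensemble of the observed variable (e.g., from an EnKF step)
u_analysis = 1.2 + 0.3*rng.standard_normal(50)

v_ensemble = features(u_analysis) @ w                   # map each member through the regression
v_mean = v_ensemble.mean()
v_var = v_ensemble.var() + resid_var                    # ensemble spread plus map uncertainty
print(f"unobserved estimate: {v_mean:.3f} +/- {np.sqrt(v_var):.3f}")
```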
-
Parameter estimation for nonlinear dynamic system models, represented by ordinary differential equations (ODEs), using noisy and sparse data, is a vital task in many fields. We propose a fast and accurate method, manifold-constrained Gaussian process inference (MAGI), for this task. MAGI uses a Gaussian process model over time series data, explicitly conditioned on the manifold constraint that derivatives of the Gaussian process must satisfy the ODE system. By doing so, we completely bypass the need for numerical integration and achieve substantial savings in computational time. MAGI is also suitable for inference with unobserved system components, which often occur in real experiments. MAGI is distinct from existing approaches as we provide a principled statistical construction under a Bayesian framework, which incorporates the ODE system through the manifold constraint. We demonstrate the accuracy and speed of MAGI using realistic examples based on physical experiments.
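MAGI itself is a full Bayesian construction; the sketch below only shows the ingredient that lets it avoid numerical integration: for a Gaussian process, the derivative is again Gaussian with covariances obtained by differentiating the kernel, so an ODE parameter can be estimated by matching the GP-implied derivative to the ODE right-hand side. The kernel, noise level, and the toy ODE x' = -θx are illustrative assumptions, not part of MAGI's formulation.

```python
# Simplified gradient-matching illustration (not MAGI itself): smooth noisy data with a
# GP, obtain the GP-implied derivative from kernel derivatives, and fit theta in
# x' = -theta * x by least squares -- no numerical ODE integration anywhere.
import numpy as np

rng = np.random.default_rng(2)
theta_true, sigma_obs, ell = 0.5, 0.05, 1.0

t = np.linspace(0.0, 8.0, 60)
x_true = 2.0*np.exp(-theta_true*t)                      # solution of x' = -theta*x, x(0) = 2
y = x_true + sigma_obs*rng.standard_normal(t.size)      # noisy observations

# squared-exponential kernel and its derivative with respect to the first argument
D = t[:, None] - t[None, :]
K = np.exp(-0.5*(D/ell)**2)
dK = -(D/ell**2)*K                                      # Cov(x'(t_i), x(t_j))

alpha = np.linalg.solve(K + sigma_obs**2*np.eye(t.size), y)
x_smooth = K @ alpha                                    # GP posterior mean of x(t)
dx_smooth = dK @ alpha                                  # GP posterior mean of x'(t)

# least-squares fit of x' = -theta*x using the GP-implied state and derivative
theta_hat = -np.sum(dx_smooth*x_smooth)/np.sum(x_smooth**2)
print(f"estimated theta: {theta_hat:.3f} (truth {theta_true})")
```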
-
Abstract Traditional ensemble Kalman filter data assimilation methods make implicit assumptions of Gaussianity and linearity that are strongly violated by many important Earth system applications. For instance, bounded quantities like the amount of a tracer and sea ice fractional coverage cannot be accurately represented by a Gaussian that is unbounded by definition. Nonlinear relations between observations and model state variables abound. Examples include the relation between a remotely sensed radiance and the column of atmospheric temperatures, or the relation between cloud amount and water vapor quantity. Part I of this paper described a very general data assimilation framework for computing observation increments for non-Gaussian prior distributions and likelihoods. These methods can respect bounds and other non-Gaussian aspects of observed variables. However, these benefits can be lost when observation increments are used to update state variables using the linear regression that is part of standard ensemble Kalman filter algorithms. Here, regression of observation increments is performed in a space where variables are transformed by the probit and probability integral transforms, a specific type of Gaussian anamorphosis. This method can enforce appropriate bounds for all quantities and deal much more effectively with nonlinear relations between observations and state variables. Important enhancements like localization and inflation can be performed in the transformed space. Results are provided for idealized bivariate distributions and for cycling assimilation in a low-order dynamical system. Implications for improved data assimilation across Earth system applications are discussed.
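A minimal sketch of the transform machinery described above (not the operational implementation): ensemble members of a bounded quantity are mapped to a standard-normal space by a rank-based probability integral transform followed by the probit, a linear regression of observation increments is applied in that space, and the inverse transforms return updated members that still respect the bounds. The ensemble values, the prescribed observation increments, and the bivariate setup are hypothetical.

```python
# Sketch of probit/probability-integral-transform regression for a bounded state variable.
# Observation-space increments (already computed in transformed space) are assumed given;
# the state update is regressed and applied in transformed space, then mapped back.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n_ens = 40

# hypothetical prior ensembles: an observed variable h and a bounded state variable c in (0, 1)
h_prior = rng.normal(0.0, 1.0, n_ens)
c_prior = 1.0/(1.0 + np.exp(-(1.5*h_prior + 0.3*rng.standard_normal(n_ens))))

def to_probit(ens):
    """Rank-based probability integral transform followed by the probit."""
    ranks = np.argsort(np.argsort(ens)) + 1.0
    u = (ranks - 0.5)/ens.size
    return norm.ppf(u)

def from_probit(z_new, ens):
    """Invert: normal CDF, then interpolate back through the ensemble's empirical quantiles."""
    u_new = norm.cdf(z_new)
    ranks = (np.argsort(np.argsort(ens)) + 0.5)/ens.size
    order = np.argsort(ens)
    return np.interp(u_new, ranks[order], ens[order])

zh, zc = to_probit(h_prior), to_probit(c_prior)

# hypothetical observation increments, already expressed in the transformed space
dz_h = -0.4*np.ones(n_ens)

# linear regression of state increments on observation increments in transformed space
beta = np.cov(zc, zh)[0, 1]/np.var(zh, ddof=1)
zc_post = zc + beta*dz_h

c_post = from_probit(zc_post, c_prior)
print("prior  range:", c_prior.min().round(3), c_prior.max().round(3))
print("update range:", c_post.min().round(3), c_post.max().round(3))   # stays inside (0, 1)
```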