Title: Ensemble Riemannian data assimilation: towards large-scale dynamical systems
Abstract. This paper presents the results of the ensemble Riemannian data assimilation for relatively high-dimensional nonlinear dynamical systems, focusing on the chaotic Lorenz-96 model and a two-layer quasi-geostrophic (QG) model of atmospheric circulation. The analysis state in this approach is inferred from a joint distribution that optimally couples the background probability distribution and the likelihood function, enabling formal treatment of systematic biases without any Gaussian assumptions. Despite the risk of the curse of dimensionality in the computation of the coupling distribution, comparisons with the classic implementation of the particle filter and the stochastic ensemble Kalman filter demonstrate that, with the same ensemble size, the presented methodology could improve the predictability of dynamical systems. In particular, under systematic errors, the root mean squared error of the analysis state can be reduced by 20 % (30 %) in the Lorenz-96 (QG) model.
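To make the coupling idea above concrete, the sketch below builds an entropy-regularized optimal-transport plan (Sinkhorn iterations) between the background ensemble and the same ensemble reweighted by the observation likelihood, then maps each member through the plan's barycentric projection. This is a minimal illustration only: the choice of support, the squared-Euclidean ground cost, the Gaussian likelihood, the regularization value, and the barycentric mapping are assumptions made here for brevity, not the exact algorithm presented in the paper.

```python
import numpy as np

def sinkhorn_coupling(a, b, C, reg=0.1, n_iter=500):
    """Entropy-regularized optimal-transport plan between discrete
    distributions a and b (weights) with ground-cost matrix C."""
    K = np.exp(-C / reg)                       # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):                    # Sinkhorn scaling iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]         # coupling with marginals a and b

def coupling_analysis_sketch(X_b, y, H, R, reg=0.1):
    """Hypothetical coupling-based ensemble analysis (illustration only).

    X_b : (n_state, n_ens) background ensemble
    y   : (n_obs,) observation vector
    H   : (n_obs, n_state) linear observation operator (assumed)
    R   : (n_obs, n_obs) observation-error covariance
    """
    n_ens = X_b.shape[1]
    a = np.full(n_ens, 1.0 / n_ens)            # background: equal member weights
    innov = y[:, None] - H @ X_b               # innovation for each member
    quad = np.sum(innov * np.linalg.solve(R, innov), axis=0)
    b = np.exp(-0.5 * (quad - quad.min()))     # Gaussian likelihood weights
    b /= b.sum()
    diff = X_b[:, :, None] - X_b[:, None, :]
    C = np.sum(diff**2, axis=0)                # squared-Euclidean ground cost
    T = sinkhorn_coupling(a, b, C, reg)
    # Barycentric projection of the plan: each background member is moved
    # toward the likelihood-weighted members it is coupled with.
    return X_b @ (T / T.sum(axis=1, keepdims=True)).T
```

In this toy form the analysis members remain inside the convex hull of the background ensemble, which is one of several simplifications relative to the published method.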
Award ID(s):
1830418 1839441 1839336
NSF-PAR ID:
10384610
Author(s) / Creator(s):
Date Published:
Journal Name:
Nonlinear Processes in Geophysics
Volume:
29
Issue:
1
ISSN:
1607-7946
Page Range / eLocation ID:
77 to 92
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Data assimilation (DA) aims to combine observations/data with a model to maximize the utility of information for obtaining the optimal estimate. The maximum likelihood ensemble filter (MLEF) is a sequential, filter-type DA method. Weaknesses of the filter method are the assimilation of time-integrated observations and the estimation of empirical model parameters, because the forward model is employed outside of the analysis procedure in this type of DA method. To overcome these weaknesses, the MLEF is now extended as a smoother and the novel maximum likelihood ensemble smoother (MLES) is proposed. The MLES is a smoothing method with variational-like qualities, specifically in the cost function. Rather than using the error information from a single temporal location to solve for the optimal analysis update as done by the MLEF, the MLES can include observations and the forward model within a chosen time window. The newly proposed DA method is first validated by a series of rigorous and thorough performance tests using the Lorenz 96 model. Then, as DA is used extensively to increase the predictability of the chaotic dynamical systems common in meteorological applications, this study demonstrates the MLES on a chaotic model problem governed by the 1D Kuramoto–Sivashinsky (KS) equation. Additionally, the MLES is shown to be an effective method for improving the estimate of uncertain empirical model parameters. The MLES and MLEF are then directly compared, and it is shown that the performance of the MLES is adequate and that it is a good candidate for increasing the predictability of a chaotic dynamical system. Future work will focus on an extensive application of the MLES to highly turbulent flows.
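For orientation, a generic smoother cost of the kind the MLES minimizes over a time window can be written as below. This is only a schematic strong-constraint form assumed here; the MLEF/MLES change of variables, the ensemble square-root preconditioning, and any weak-constraint terms are omitted.

```latex
J(\mathbf{x}_0) = \tfrac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}_b)^{\mathsf{T}} \mathbf{P}_f^{-1} (\mathbf{x}_0 - \mathbf{x}_b)
  + \tfrac{1}{2} \sum_{k=0}^{K} \bigl[\mathbf{y}_k - \mathcal{H}_k\bigl(\mathcal{M}_{t_0 \to t_k}(\mathbf{x}_0)\bigr)\bigr]^{\mathsf{T}} \mathbf{R}_k^{-1} \bigl[\mathbf{y}_k - \mathcal{H}_k\bigl(\mathcal{M}_{t_0 \to t_k}(\mathbf{x}_0)\bigr)\bigr]
```

Here \(\mathcal{M}_{t_0 \to t_k}\) is the forward model, \(\mathcal{H}_k\) the observation operator, \(\mathbf{P}_f\) the ensemble-represented forecast-error covariance, and \(\mathbf{R}_k\) the observation-error covariance; the MLEF corresponds to keeping only the single observation term at the analysis time.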
  2. Abstract. Rejuvenation in particle filters is necessary to prevent the collapse of the weights when the number of particles is insufficient to properly sample the high-probability regions of the state space. Rejuvenation is often implemented in a heuristic manner by the addition of random noise that widens the support of the ensemble. This work aims at improving the canonical rejuvenation methodology by introducing additional prior information obtained from climatological samples; the dynamical particles used for importance sampling are augmented with samples obtained from stochastic covariance shrinkage. A localized variant of the proposed method is developed. Numerical experiments with the Lorenz '63 model show that the modified filters significantly improve the analyses for low dynamical ensemble sizes. Furthermore, localization experiments with the Lorenz '96 model show that the proposed methodology is extendable to larger systems.
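A minimal sketch of the augmentation idea, assuming a Gaussian draw from a shrinkage covariance: the small dynamical ensemble is extended with synthetic particles whose covariance blends the sample covariance of the particles with a climatological covariance. The shrinkage weight gamma, the Gaussian proposal, and the centering on the ensemble mean are illustrative choices here, not the paper's exact stochastic shrinkage scheme, and the importance weights of the added particles would still have to be set from the proposal density.

```python
import numpy as np

def shrinkage_rejuvenation_sketch(X, X_clim, gamma=0.3, n_extra=50, rng=None):
    """Augment a particle ensemble X (n_state x n_part) with synthetic
    particles drawn from a shrinkage covariance that mixes the dynamical
    ensemble covariance with a climatological target (illustration only)."""
    rng = np.random.default_rng() if rng is None else rng
    x_mean = X.mean(axis=1)
    P_dyn = np.cov(X)                        # rank-deficient for small ensembles
    P_clim = np.cov(X_clim)                  # covariance of climatological samples
    P_shrunk = (1.0 - gamma) * P_dyn + gamma * P_clim   # shrink toward climatology
    X_new = rng.multivariate_normal(x_mean, P_shrunk, size=n_extra).T
    return np.hstack([X, X_new])             # widened support for importance sampling
```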
  3. Abstract

    A hybrid data assimilation algorithm is developed for complex dynamical systems with partial observations. The method starts with applying a spectral decomposition to the entire spatiotemporal fields, followed by creating a machine learning model that builds a nonlinear map between the coefficients of observed and unobserved state variables for each spectral mode. A cheap low‐order nonlinear stochastic parameterized extended Kalman filter (SPEKF) model is employed as the forecast model in the ensemble Kalman filter to deal with each mode associated with the observed variables. The resulting ensemble members are then fed into the machine learning model to create an ensemble of the corresponding unobserved variables. In addition to the ensemble spread, the training residual in the machine learning‐induced nonlinear map is further incorporated into the state estimation, advancing the diagnostic quantification of the posterior uncertainty. The hybrid data assimilation algorithm is applied to a precipitating quasi‐geostrophic (PQG) model, which includes the effects of water vapor, clouds, and rainfall beyond the classical two‐level QG model. The complicated nonlinearities in the PQG equations prevent traditional methods from building simple and accurate reduced‐order forecast models. In contrast, the SPEKF forecast model is skillful in recovering the intermittent observed states, and the machine learning model effectively estimates the chaotic unobserved signals. Utilizing the calibrated SPEKF and machine learning models under a moderate cloud fraction, the resulting hybrid data assimilation remains reasonably accurate when applied to other geophysical scenarios with nearly clear skies or relatively heavy rainfall, implying the robustness of the algorithm for extrapolation.
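The per-mode workflow described above can be sketched as a small pipeline: forecast the observed-mode ensemble with a cheap stochastic model, update it with a stochastic EnKF, push the analysis ensemble through a learned map to the unobserved mode, and inflate that mapped ensemble by the map's training residual. Everything except the EnKF update is a toy stand-in (damped persistence instead of the actual SPEKF model, a cubic least-squares fit instead of the paper's machine-learning model, a single real-valued mode), so treat it as a shape-of-the-algorithm sketch rather than the published method.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_update(E, y, H, R):
    """Stochastic EnKF analysis for an ensemble E (n x m) of one mode's
    coefficients, given observation y, operator H, and error covariance R."""
    n, m = E.shape
    A = E - E.mean(axis=1, keepdims=True)
    Pf = A @ A.T / (m - 1)
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
    return E + K @ (Y - H @ E)                      # perturbed-observation update

def cheap_forecast(E, dt=0.1, damping=0.9):
    """Stand-in for the SPEKF-type forecast of the observed-mode ensemble:
    damped persistence plus stochastic forcing (NOT the actual SPEKF model)."""
    return damping * E + np.sqrt(dt) * rng.standard_normal(E.shape)

def fit_mode_map(u_obs, u_hid):
    """Stand-in for the machine-learning map from observed to unobserved
    coefficients of one mode: a cubic polynomial least-squares fit that also
    returns the training-residual spread used to widen the mapped ensemble."""
    Phi = np.vander(u_obs, 4)                       # features [u^3, u^2, u, 1]
    coef, *_ = np.linalg.lstsq(Phi, u_hid, rcond=None)
    return coef, (u_hid - Phi @ coef).std()

# One assimilation cycle for a single real-valued spectral mode:
m = 50
E = rng.standard_normal((1, m))                     # observed-mode ensemble
u_train = np.linspace(-2.0, 2.0, 200)               # synthetic training pairs
coef, sigma_r = fit_mode_map(u_train, np.tanh(u_train) + 0.05 * rng.standard_normal(200))

E = cheap_forecast(E)                               # forecast step
E = enkf_update(E, y=np.array([0.7]), H=np.eye(1), R=0.1 * np.eye(1))
# Map the analysis ensemble to the unobserved mode and add the training
# residual to its spread, as the abstract advocates for the posterior uncertainty.
E_hidden = np.vander(E.ravel(), 4) @ coef + sigma_r * rng.standard_normal(m)
```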

     
  4. Abstract

    Long‐lead forecasting for spatio‐temporal systems can entail complex nonlinear dynamics that are difficult to specify a priori. Current statistical methodologies for modeling these processes are often highly parameterized and, thus, challenging to implement from a computational perspective. One potential parsimonious solution to this problem is a method from the dynamical systems and engineering literature referred to as an echo state network (ESN). ESN models use reservoir computing to efficiently compute recurrent neural network forecasts. Moreover, multilevel (deep) hierarchical models have recently been shown to be successful at predicting high‐dimensional complex nonlinear processes, particularly those with multiple spatial and temporal scales of variability (such as those we often find in spatio‐temporal environmental data). Here, we introduce a deep ensemble ESN (D‐EESN) model. Despite the incorporation of a deep structure, the presented model is computationally efficient. We present two versions of this model for spatio‐temporal processes that produce forecasts and associated measures of uncertainty. The first approach utilizes a bootstrap ensemble framework, and the second is developed within a hierarchical Bayesian framework (BD‐EESN). This more general hierarchical Bayesian framework naturally accommodates non‐Gaussian data types and multiple levels of uncertainties. The methodology is first applied to a data set simulated from a novel non‐Gaussian multiscale Lorenz‐96 dynamical system simulation model and, then, to a long‐lead United States (U.S.) soil moisture forecasting application. Across both applications, the proposed methodology improves upon existing methods in terms of both forecast accuracy and quantifying uncertainty.
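As a rough picture of the building blocks, the sketch below pairs a minimal echo state network (fixed random reservoir, trained linear readout) with a naive block-bootstrap ensemble whose member spread serves as an uncertainty proxy. Reservoir size, sparsity, spectral radius, leaking rate, ridge penalty, and the half-window block bootstrap are all illustrative choices here; the D-EESN's deep/multilevel structure and its hierarchical Bayesian variant are not represented.

```python
import numpy as np

class TinyESN:
    """Minimal echo state network: random sparse reservoir, trained readout."""
    def __init__(self, n_in, n_res=200, rho=0.9, leak=0.5, ridge=1e-6, seed=0):
        rng = np.random.default_rng(seed)
        self.Win = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        W[rng.random((n_res, n_res)) > 0.1] = 0.0              # ~10 % dense reservoir
        self.W = W * (rho / np.max(np.abs(np.linalg.eigvals(W))))  # set spectral radius
        self.leak, self.ridge = leak, ridge

    def _states(self, U):                                      # U: (T, n_in)
        h, H = np.zeros(self.W.shape[0]), []
        for u in U:                                            # leaky reservoir update
            h = (1 - self.leak) * h + self.leak * np.tanh(self.Win @ u + self.W @ h)
            H.append(h.copy())
        return np.asarray(H)

    def fit(self, U, Y):                                       # ridge-regression readout
        H = self._states(U)
        self.Wout = np.linalg.solve(H.T @ H + self.ridge * np.eye(H.shape[1]), H.T @ Y)

    def predict(self, U):
        return self._states(U) @ self.Wout

def bootstrap_esn_forecast(U, Y, U_new, n_members=20):
    """Bootstrap ensemble of ESNs: each member is refit on a contiguous block
    of the training data; the member spread is a crude uncertainty measure."""
    T, preds = len(U), []
    for k in range(n_members):
        start = np.random.default_rng(k).integers(0, T - T // 2)
        esn = TinyESN(U.shape[1], seed=k)
        esn.fit(U[start:start + T // 2], Y[start:start + T // 2])
        preds.append(esn.predict(U_new))
    P = np.asarray(preds)
    return P.mean(axis=0), P.std(axis=0)                       # forecast and spread
```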

     
  5.
    Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace expensive numerical integration of complex coupled partial differential equations at fine time scales, performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system and the three-tier multi-scale Lorenz 96 system (Thornes T, Duben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143, 897–908. (doi:10.1002/qj.2974)) as benchmarks, we find that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) of the sparse, randomly generated reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz JN. 2016 Proc. Natl Acad. Sci. USA 113, 3932–3937. (doi:10.1073/pnas.1517384113)). We also show experimentally that LSR-ESNs (Chattopadhyay A, Hassanzadeh P, Subramanian D. 2019 (http://arxiv.org/abs/1906.08829)) outperform HSR-ESNs (Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018 Phys. Rev. Lett. 120, 024102. (doi:10.1103/PhysRevLett.120.024102)), while D2R2 dominates both approaches. A significant goal in constructing surrogates is to cope with barriers to scaling in weather prediction and simulation of dynamical systems that are imposed by the time and energy consumption of supercomputers. Inexact computing has emerged as a novel approach to helping with scaling. In this paper, we evaluate the performance of three models (LSR-ESN, HSR-ESN and D2R2) by varying the precision or word size of the computation as our inexactness-controlling parameter. For precisions of 64, 32 and 16 bits, we show that, surprisingly, the least expensive D2R2 method yields the most robust results and the greatest savings compared to ESNs. Specifically, D2R2 achieves 68× computational savings, with an additional 2× if precision reductions are also employed, outperforming the ESN variants by a large margin. This article is part of the theme issue 'Machine learning for weather and climate modelling'.
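Since D2R2 is described as a polynomial regression, the following sketch shows the general shape under that reading: ridge regression of the next state onto a library of constant, linear, and pairwise-quadratic features of the current state (a Lorenz-style library). The feature set, the ridge penalty, and the one-step-map formulation are assumptions here, not the specific domain-driven library or regularization used by the authors.

```python
import numpy as np

def d2r2_sketch(X, Y, ridge=1e-4):
    """Ridge regression of targets Y onto [1, x, x⊗x] features of states X
    (rows are snapshots); returns a callable one-step surrogate map."""
    quad = np.einsum('ti,tj->tij', X, X).reshape(len(X), -1)   # pairwise products
    Phi = np.hstack([np.ones((len(X), 1)), X, quad])
    W = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(Phi.shape[1]), Phi.T @ Y)
    return lambda x: np.hstack([1.0, x, np.outer(x, x).ravel()]) @ W

# Toy usage: learn a one-step map x_{t+1} = f(x_t) from synthetic snapshots.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8))            # 500 snapshots of an 8-variable state
Y = np.roll(X, -1, axis=0)                   # stand-in "next state" targets
f = d2r2_sketch(X[:-1], Y[:-1])
x_next = f(X[-1])                            # one-step surrogate forecast
```

Replacing the ridge penalty with a sparsity-promoting regression over the same library is what connects this formulation to the SINDy algorithm mentioned in the abstract.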