
Title: The maximum likelihood ensemble smoother for the Kuramoto–Sivashinsky equation
Abstract. Data assimilation (DA) aims to combine observations/data with a model to maximize the utility of information for obtaining the optimal estimate. The maximum likelihood ensemble filter (MLEF) is a sequential DA method, that is, a filter-type method. Filter methods are weak at assimilating time-integrated observations and at estimating empirical parameters, because the forward model is employed outside of the analysis procedure in this type of DA method. To overcome these weaknesses, the MLEF is extended to a smoother, and the novel maximum likelihood ensemble smoother (MLES) is proposed. The MLES is a smoothing method with variational-like qualities, specifically in the cost function. Rather than using the error information from a single temporal location to solve for the optimal analysis update, as the MLEF does, the MLES can include observations and the forward model within a chosen time window. The newly proposed DA method is first validated by a series of performance tests using the Lorenz 96 model. Then, because DA is used extensively to increase the predictability of the chaotic dynamical systems common in meteorological applications, this study demonstrates the MLES on a model chaotic problem governed by the 1D Kuramoto–Sivashinsky (KS) equation. Additionally, the MLES is shown to be an effective method for improving the estimate of uncertain empirical model parameters. The MLES and MLEF are then directly compared, and it is shown that the performance of the MLES is adequate and that it is a good candidate for increasing the predictability of a chaotic dynamical system. Future work will focus on an extensive application of the MLES to highly turbulent flows.
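To make the contrast between a single-time analysis and a time-window analysis concrete, the following is a minimal sketch of a generic window cost function on the Lorenz 96 model. It is not the MLES formulation from the paper: the function names (`lorenz96_rhs`, `rk4_step`, `window_cost`), the identity observation operator, the RK4 integrator, and the explicit inverse covariance matrices `b_inv` and `r_inv` are all assumptions made for illustration, and the ensemble-based covariance and minimization used by MLEF-type methods are omitted.

```python
import numpy as np

def lorenz96_rhs(x, forcing=8.0):
    # Lorenz 96 tendency with periodic indices:
    # dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt, forcing=8.0):
    # One classical fourth-order Runge-Kutta step of the forward model.
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_rhs(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def window_cost(x0, x_background, b_inv, observations, r_inv,
                dt=0.05, forcing=8.0):
    """Background term plus observation misfits accumulated over a window.

    observations: list of (step_index, y) pairs sorted by step_index >= 1.
    The observation operator is assumed to be the identity (hypothetical).
    """
    dx = x0 - x_background
    cost = 0.5 * dx @ b_inv @ dx
    x = np.array(x0, dtype=float)
    step = 0
    for k, y in observations:
        # The forward model is run inside the cost evaluation, so every
        # observation in the window constrains the initial state x0.
        while step < k:
            x = rk4_step(x, dt, forcing)
            step += 1
        innovation = y - x
        cost += 0.5 * innovation @ r_inv @ innovation
    return cost
```

Because the forward model sits inside the cost evaluation, the same function can also be evaluated for different values of `forcing`, which hints at how an uncertain model parameter could be exposed to the minimization in a window-based scheme.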
Journal Name:
IMA Journal of Applied Mathematics
Page Range / eLocation ID:
935 to 963
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. This paper presents the results of the ensemble Riemannian data assimilation for relatively high-dimensional nonlinear dynamical systems, focusing on the chaotic Lorenz-96 model and a two-layer quasi-geostrophic (QG) model of atmospheric circulation. The analysis state in this approach is inferred from a joint distribution that optimally couples the background probability distribution and the likelihood function, enabling formal treatment of systematic biases without any Gaussian assumptions. Despite the risk of the curse of dimensionality in the computation of the coupling distribution, comparisons with the classic implementation of the particle filter and the stochastic ensemble Kalman filter demonstrate that, with the same ensemble size, the presented methodology could improve the predictability of dynamical systems. In particular, under systematic errors, the root mean squared error of the analysis state can be reduced by 20 % (30 %) in the Lorenz-96 (QG) model. 
  2. Abstract

    Forecasting the El Niño-Southern Oscillation (ENSO) has been a subject of vigorous research due to the important role of the phenomenon in climate dynamics and its worldwide socioeconomic impacts. Over the past decades, numerous models for ENSO prediction have been developed, among which statistical models approximating ENSO evolution by linear dynamics have received significant attention owing to their simplicity and comparable forecast skill to first-principles models at short lead times. Yet, due to highly nonlinear and chaotic dynamics (particularly during ENSO initiation), such models have limited skill for longer-term forecasts beyond half a year. To resolve this limitation, here we employ a new nonparametric statistical approach based on analog forecasting, called kernel analog forecasting (KAF), which avoids assumptions on the underlying dynamics through the use of nonlinear kernel methods for machine learning and dimension reduction of high-dimensional datasets. Through a rigorous connection with Koopman operator theory for dynamical systems, KAF yields statistically optimal predictions of future ENSO states as conditional expectations, given noisy and potentially incomplete data at forecast initialization. Here, using industrial-era Indo-Pacific sea surface temperature (SST) as training data, the method is shown to successfully predict the Niño 3.4 index in a 1998–2017 verification period out to a 10-month lead, which corresponds to an increase of 3–8 months (depending on the decade) over a benchmark linear inverse model (LIM), while significantly improving upon the ENSO predictability “spring barrier”. In particular, KAF successfully predicts the historic 2015/16 El Niño at initialization times as early as June 2015, which is comparable to the skill of current dynamical models. 
An analysis of a 1300-yr control integration of a comprehensive climate model (CCSM4) further demonstrates that the enhanced predictability afforded by KAF holds over potentially much longer leads, extending to 24 months versus 18 months in the benchmark LIM. Probabilistic forecasts for the occurrence of El Niño/La Niña events are also performed and assessed via information-theoretic metrics, showing an improvement of skill over LIM approaches, thus opening an avenue for environmental risk assessment relevant in a variety of contexts.

  3. Models of many engineering and natural systems are imperfect. The discrepancy between the mathematical representations of a true physical system and its imperfect model is called the model error. These model errors can lead to substantial differences between the numerical solutions of the model and the state of the system, particularly in those involving nonlinear, multi-scale phenomena. Thus, there is increasing interest in reducing model errors, particularly by leveraging the rapidly growing observational data to understand their physics and sources. Here, we introduce a framework named MEDIDA: Model Error Discovery with Interpretability and Data Assimilation. MEDIDA only requires a working numerical solver of the model and a small number of noise-free or noisy sporadic observations of the system. In MEDIDA, first, the model error is estimated from differences between the observed states and model-predicted states (the latter are obtained from a number of one-time-step numerical integrations from the previous observed states). If observations are noisy, a data assimilation technique, such as the ensemble Kalman filter, is employed to provide the analysis state of the system, which is then used to estimate the model error. Finally, an equation-discovery technique, here the relevance vector machine, a sparsity-promoting Bayesian method, is used to identify an interpretable, parsimonious, and closed-form representation of the model error. Using the chaotic Kuramoto–Sivashinsky system as the test case, we demonstrate the excellent performance of MEDIDA in discovering different types of structural/parametric model errors, representing different types of missing physics, using noise-free and noisy observations.

  4. Abstract

    Localization is essential to effectively assimilate satellite radiances in ensemble Kalman filters. However, the vertical location and separation from a model grid point variable for a radiance observation are not well defined, which results in complexities when localizing the impact of radiance observations. An adaptive method is proposed to estimate an effective vertical localization independently for each assimilated channel of every satellite platform. It uses sample correlations between ensemble priors of observations and state variables from a cycling data assimilation to estimate the localization function that minimizes the sampling error. The estimated localization functions are approximated by three localization parameters: the localization width, maximum value, and vertical location of the radiance observations. Adaptively estimated localization parameters are used in assimilation experiments with the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) model and the National Oceanic and Atmospheric Administration (NOAA) operational ensemble Kalman filter (EnKF). Results show that using the adaptive localization width and vertical location for radiance observations is more beneficial than also including the maximum localization value. The experiment using the adaptively estimated localization width and vertical location performs better than the default Gaspari and Cohn (GC) experiment, and produces similar errors to the optimal GC experiment. The adaptive localization parameters can be computed during the assimilation procedure, so the computational cost needed to tune the optimal GC localization width is saved.

  5. Bucchignani, Edoardo ; Williams, Paul D. (Ed.)

    Stratospheric dynamics are strongly affected by the absorption/emission of radiation in the Earth’s atmosphere and Rossby waves that propagate upward from the troposphere, perturbing the zonal flow. Reduced order models of stratospheric wave–zonal interactions, which parameterize these effects, have been used to study interannual variability in stratospheric zonal winds and sudden stratospheric warming (SSW) events. These models are most sensitive to two main parameters: Λ, forcing the mean radiative zonal wind gradient, and h, a perturbation parameter representing the effect of Rossby waves. We take one such reduced order model with 20 years of ECMWF atmospheric reanalysis data and estimate Λ and h using both a particle filter and an ensemble smoother to investigate if the highly-simplified model can accurately reproduce the averaged reanalysis data and which parameter properties may be required to do so. We find that by allowing additional complexity via an unparameterized Λ(t), the model output can closely match the reanalysis data while maintaining behavior consistent with the dynamical properties of the reduced-order model. Furthermore, our analysis shows physical signatures in the parameter estimates around known SSW events. This work provides a data-driven examination of these important parameters representing fundamental stratospheric processes through the lens and tractability of a reduced order model, shown to be physically representative of the relevant atmospheric dynamics.

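Item 4 in the list above compares adaptively estimated localization against the default Gaspari and Cohn (GC) function. As a rough illustration only, the sketch below evaluates the standard fifth-order piecewise GC taper and wraps it in a helper that takes the three localization parameters the abstract names (width, maximum value, vertical location). The function names `gaspari_cohn` and `vertical_localization`, and the way the three parameters are combined, are assumptions for this sketch and are not taken from that paper.

```python
import numpy as np

def gaspari_cohn(distance, half_width):
    # Fifth-order piecewise rational taper of Gaspari and Cohn:
    # equals 1 at zero separation and 0 beyond twice the half-width.
    r = np.abs(np.atleast_1d(np.asarray(distance, dtype=float))) / half_width
    taper = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri = r[inner]
    taper[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                    - (5.0 / 3.0) * ri**2 + 1.0)
    ro = r[outer]
    taper[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                    + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0
                    - 2.0 / (3.0 * ro))
    return taper

def vertical_localization(level, location, half_width, max_value=1.0):
    # Hypothetical three-parameter form: a GC taper centered on an assumed
    # vertical location of the observation, scaled by a maximum value.
    return max_value * gaspari_cohn(np.asarray(level) - location, half_width)
```

In this form, tuning `half_width` alone corresponds to the usual GC width tuning, while `location` and `max_value` are the extra degrees of freedom an adaptive scheme might estimate per channel.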