Title: Empirical mode modeling: A data-driven approach to recover and forecast nonlinear dynamics from noisy data
Abstract Data-driven, model-free analytics are natural choices for discovery and forecasting of complex, nonlinear systems. Methods that operate in the system state-space require either an explicit multidimensional state-space or one approximated from available observations. Because observational data are frequently sampled with noise, the noise can corrupt the state-space representation and degrade analytical performance. Here, we evaluate the synthesis of empirical mode decomposition with empirical dynamic modeling, which we term empirical mode modeling, to increase the information content of state-space representations in the presence of noise. Evaluation of a mathematical application and an ecologically important geophysical application across three different state-space representations suggests that empirical mode modeling may be a useful technique for data-driven, model-free, state-space analysis in the presence of noise.
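As an illustration of a state-space "approximated from available observations", the sketch below builds a time-delay embedding of a single noisy series. This is only one common way to form such a representation; the abstract does not say which three representations were evaluated, so the choice of delay embedding, the helper name `delay_embed`, and the parameters `E` (embedding dimension) and `tau` (lag) are assumptions made for this example. It does not perform empirical mode decomposition.

```python
import numpy as np

def delay_embed(x, E, tau=1):
    """Return an E-dimensional time-delay embedding of a 1-D series.

    Row i is [x[t], x[t - tau], ..., x[t - (E - 1) * tau]]
    with t = i + (E - 1) * tau. Hypothetical helper for illustration.
    """
    x = np.asarray(x, dtype=float)
    n_rows = x.size - (E - 1) * tau
    if n_rows <= 0:
        raise ValueError("series too short for the requested E and tau")
    # Column j holds the series lagged by j * tau.
    cols = [x[(E - 1 - j) * tau : (E - 1 - j) * tau + n_rows] for j in range(E)]
    return np.column_stack(cols)

# Toy data (assumed): a sine wave with additive Gaussian observation noise.
t = np.arange(200)
clean = np.sin(0.2 * t)
rng = np.random.default_rng(0)
noisy = clean + 0.1 * rng.standard_normal(t.size)

# Every coordinate of the embedded state inherits the observation noise.
M = delay_embed(noisy, E=3, tau=2)
print(M.shape)  # (196, 3)
```

Because each embedded coordinate is a lagged copy of the same noisy series, the noise enters every dimension of the reconstructed state, which is the degradation the abstract describes.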
Award ID(s):
1660584 1655203
NSF-PAR ID:
10331730
Journal Name:
Nonlinear Dynamics
Volume:
108
Issue:
3
ISSN:
0924-090X
Page Range / eLocation ID:
2147 to 2160
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Diffuse interstellar bands (DIBs) are broad absorption features associated with interstellar dust and can serve as chemical and kinematic tracers. Conventional measurements of DIBs in stellar spectra are complicated by residuals between observations and best-fit stellar models. To overcome this, we simultaneously model the spectrum as a combination of stellar, dust, and residual components, with full posteriors on the joint distribution of the components. This decomposition is obtained by modeling each component as a draw from a high-dimensional Gaussian distribution in the data space (the observed spectrum), a method we call “Marginalized Analytic Data-space Gaussian Inference for Component Separation” (MADGICS). We use a data-driven prior for the stellar component, which avoids missing stellar features not well modeled by synthetic spectra. This technique provides statistically rigorous uncertainties and detection thresholds, which are required to work in the low signal-to-noise regime that is commonplace for dusty lines of sight. We reprocess all public Gaia DR3 RVS spectra and present an improved 8621 Å DIB catalog, free of detectable stellar line contamination. We constrain the rest-frame wavelength to 8623.14 ± 0.087 Å (vacuum), find no significant evidence for DIBs in the Local Bubble from the 1/6th of RVS spectra that are public, and show unprecedented correlation with kinematic substructure in Galactic CO maps. We validate the catalog, its reported uncertainties, and biases using synthetic injection tests. We believe MADGICS provides a viable path forward for large-scale spectral line measurements in the presence of complex spectral contamination.
  2. Transient growth and resolvent analyses are routinely used to assess nonasymptotic properties of fluid flows. In particular, resolvent analysis can be interpreted as a special case of viewing flow dynamics as an open system in which free-stream turbulence, surface roughness, and other irregularities provide sources of input forcing. We offer a comprehensive summary of the tools that can be employed to probe the dynamics of fluctuations around a laminar or turbulent base flow in the presence of such stochastic or deterministic input forcing and describe how input–output techniques enhance resolvent analysis. Specifically, physical insights that may remain hidden in the resolvent analysis are gained by detailed examination of input–output responses between spatially localized body forces and selected linear combinations of state variables. This differentiating feature plays a key role in quantifying the importance of different mechanisms for bypass transition in wall-bounded shear flows and in explaining how turbulent jets generate noise. We highlight the utility of a stochastic framework, with white or colored inputs, in addressing a variety of open challenges including transition in complex fluids, flow control, and physics-aware data-driven turbulence modeling. Applications with temporally or spatially periodic base flows are discussed and future research directions are outlined. 
  3. The Sun emits a stream of charged particles called the solar wind, which is the primary driver of space weather and geomagnetic disturbances. Modeling and observations complement each other to help us identify and understand the physical processes governing the solar wind dynamics on different scales. Numerical models of the solar wind have greatly improved in recent years with advances in computational infrastructure and by employing data-driven or data-assimilative approaches. Designed primarily for modeling the partially ionized space plasma using adaptive mesh refinement on Cartesian or spherical grids, the Multi-scale Fluid-kinetic Simulation Suite (MS-FLUKSS) is arguably one of the most sophisticated numerical codes for simulating the solar wind flow. To inform potential users and interested members of the space weather community, we present a brief summary of the current state of the solar wind models developed in the MS-FLUKSS framework, with an emphasis on the 3D heliospheric MHD models driven and constrained by remote/in situ observations and empirical coronal models such as the Wang-Sheeley-Arge model. We also discuss potential scientific and operational applications of our solar wind models to the prediction of space weather (e.g., high speed streams, coronal mass ejections, and interplanetary shocks) throughout the solar system.
  4. This article introduces an isometric manifold embedding data-driven paradigm designed to enable model-free simulations with noisy data sampled from a constitutive manifold. The proposed data-driven approach iterates between a global optimization problem that seeks admissible solutions for the balance principle and a local optimization problem that finds the closest point projection of the Euclidean space that isometrically embeds a nonlinear constitutive manifold. To de-noise the database, a geometric autoencoder is introduced such that the encoder first learns to create an approximated embedding that maps the underlying low-dimensional structure of the high-dimensional constitutive manifold onto a flattened manifold with less curvature. We then obtain the noise-free constitutive responses by projecting data onto a denoised latent space that is completely flat by assuming that the noise and the underlying constitutive signal are orthogonal to each other. Consequently, a projection from the conservative manifold onto this de-noised constitutive latent space enables us to complete the local optimization step of the data-driven paradigm. Finally, to decode the data expressed in the latent space without reintroducing noise, we impose a set of isometry constraints while training the autoencoder such that the nonlinear mapping from the latent space to the reconstructed constituent manifold is distance-preserving. Numerical examples are used to both validate the implementation and demonstrate the accuracy, robustness, and limitations of the proposed paradigm. 
  5. Models of many engineering and natural systems are imperfect. The discrepancy between the mathematical representations of a true physical system and its imperfect model is called the model error. These model errors can lead to substantial differences between the numerical solutions of the model and the state of the system, particularly in those involving nonlinear, multi-scale phenomena. Thus, there is increasing interest in reducing model errors, particularly by leveraging the rapidly growing observational data to understand their physics and sources. Here, we introduce a framework named MEDIDA: Model Error Discovery with Interpretability and Data Assimilation. MEDIDA only requires a working numerical solver of the model and a small number of noise-free or noisy sporadic observations of the system. In MEDIDA, first, the model error is estimated from differences between the observed states and model-predicted states (the latter are obtained from a number of one-time-step numerical integrations from the previous observed states). If observations are noisy, a data assimilation technique, such as the ensemble Kalman filter, is employed to provide the analysis state of the system, which is then used to estimate the model error. Finally, an equation-discovery technique, here the relevance vector machine, a sparsity-promoting Bayesian method, is used to identify an interpretable, parsimonious, and closed-form representation of the model error. Using the chaotic Kuramoto–Sivashinsky system as the test case, we demonstrate the excellent performance of MEDIDA in discovering different types of structural/parametric model errors, representing different types of missing physics, using noise-free and noisy observations.
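The workflow summarized in item 5 (estimate the model error from one-time-step model integrations started at observed states, then fit a sparse closed-form expression to it) can be sketched on a toy problem. Everything here is an assumption made for illustration: the scalar system `dx/dt = -x - 0.5 x^3`, the imperfect model `dx/dt = -x`, forward-Euler stepping, the three-term polynomial library, and the use of thresholded least squares in place of the relevance vector machine named in the abstract. Observations are noise-free, so no data assimilation step is included.

```python
import numpy as np

def true_step(x, dt):
    # One forward-Euler step of the assumed "true" system dx/dt = -x - 0.5 x^3.
    return x + dt * (-x - 0.5 * x**3)

def model_step(x, dt):
    # One forward-Euler step of the imperfect model dx/dt = -x (cubic term missing).
    return x + dt * (-x)

dt = 0.01
x_obs = np.linspace(-2.0, 2.0, 41)   # observed states (noise-free toy data)
x_next_obs = true_step(x_obs, dt)    # observations one time step later
x_next_model = model_step(x_obs, dt) # one-step model predictions from the same states

# Model error estimate: observed minus predicted state, per unit time.
error = (x_next_obs - x_next_model) / dt

# Candidate library of closed-form terms (hypothetical choice).
names = ["x", "x^2", "x^3"]
library = np.column_stack([x_obs, x_obs**2, x_obs**3])

# Thresholded least squares: fit, drop negligible terms, refit on the rest.
coef = np.linalg.lstsq(library, error, rcond=None)[0]
active = np.abs(coef) > 1e-6
sparse_coef = np.zeros_like(coef)
if active.any():
    sparse_coef[active] = np.linalg.lstsq(library[:, active], error, rcond=None)[0]

for name, c in zip(names, sparse_coef):
    if c != 0.0:
        print(f"model error term: {c:+.3f} * {name}")
```

In this contrived setting the truth and the model share the same time-stepping scheme, so the difference isolates the missing cubic term exactly; with real data, integration error and observation noise would also enter the estimate.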