skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Investigating SINDy as a Tool for Causal Discovery in Time Series Signals
The SINDy algorithm has been successfully used to identify the governing equations of dynamical systems from time series data. In this paper, we argue that this makes SINDy a potentially useful tool for causal discovery and that existing tools for causal discovery can be used to dramatically improve the performance of SINDy as tool for robust sparse modeling and system identification. We then demonstrate empirically that augmenting the SINDy algorithm with tools from causal discovery can provides engineers with a tool for learning causally robust governing equations.  more » « less
Award ID(s):
1954364
PAR ID:
10417994
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Page Range / eLocation ID:
1 to 5
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Dynamical equations form the basis of design for manufacturing processes and control systems; however, identifying governing equations using a mechanistic approach is tedious. Recently, Machine learning (ML) has shown promise to identify the governing dynamical equations for physical systems faster. This possibility of rapid identification of governing equations provides an exciting opportunity for advancing dynamical systems modeling. However, applicability of the ML approach in identifying governing mechanisms for the dynamics of complex systems relevant to manufacturing has not been tested. We test and compare the efficacy of two white-box ML approaches (SINDy and SymReg) for predicting dynamics and structure of dynamical equations for overall dynamics in a distillation column. Results demonstrate that a combination of ML approaches should be used to identify a full range of equations. In terms of physical law, few terms were interpretable as related to Fick’s law of diffusion and Henry’s law in SINDy, whereas SymReg identified energy balance as driving dynamics. 
    more » « less
  2. Discovering governing physical laws from noisy data is a grand challenge in many science and engineering research areas. We present a new approach to data-driven discovery of ordinary differential equations (ODEs) and partial differential equations (PDEs), in explicit or implicit form. We demonstrate our approach on a wide range of problems, including shallow water equations and Navier–Stokes equations. The key idea is to select candidate terms for the underlying equations using dimensional analysis, and to approximate the weights of the terms with error bars using our threshold sparse Bayesian regression. This new algorithm employs Bayesian inference to tune the hyperparameters automatically. Our approach is effective, robust and able to quantify uncertainties by providing an error bar for each discovered candidate equation. The effectiveness of our algorithm is demonstrated through a collection of classical ODEs and PDEs. Numerical experiments demonstrate the robustness of our algorithm with respect to noisy data and its ability to discover various candidate equations with error bars that represent the quantified uncertainties. Detailed comparisons with the sequential threshold least-squares algorithm and the lasso algorithm are studied from noisy time-series measurements and indicate that the proposed method provides more robust and accurate results. In addition, the data-driven prediction of dynamics with error bars using discovered governing physical laws is more accurate and robust than classical polynomial regressions. 
    more » « less
  3. Alber, Mark (Ed.)
    Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model construction: differential equation models can be learnt from data via model discovery using sparse identification of nonlinear dynamics (SINDy). However, SINDy struggles with realistic levels of biological noise and is limited in its ability to incorporate prior knowledge of the system. We propose a data-driven framework for model discovery and model selection using hybrid dynamical systems: partial models containing missing terms. Neural networks are used to approximate the unknown dynamics of a system, enabling the denoising of the data while simultaneously learning the latent dynamics. Simulations from the fitted neural network are then used to infer models using sparse regression. We show, via model selection, that model discovery using hybrid dynamical systems outperforms alternative approaches. We find it possible to infer models correctly up to high levels of biological noise of different types. We demonstrate the potential to learn models from sparse, noisy data in application to a canonical cell state transition using data derived from single-cell transcriptomics. Overall, this approach provides a practical framework for model discovery in biology in cases where data are noisy and sparse, of particular utility when the underlying biological mechanisms are partially but incompletely known. 
    more » « less
  4. Abstract This study presents a data-driven framework for modeling complex systems, with a specific emphasis on traffic modeling. Traditional methods in traffic modeling often rely on assumptions regarding vehicle interactions. Our approach comprises two steps: first, utilizing information- theoretic (IT) tools to identify interaction directions and candidate variables thus eliminating assumptions, and second, employing the sparse identification of nonlinear systems (SINDy) tool to establish functional relationships. We validate the framework’s efficacy using synthetic data from two distinct traffic models, while considering measurement noise. Results show that IT tools can reliably detect directions of interaction as well as instances of no interaction. SINDy proves instrumental in creating precise functional relationships and determining coefficients in tested models. The innovation of our framework lies in its ability to use data-driven approach to model traffic dynamics without relying on assumptions, thus offering applications in various complex systems beyond traffic. 
    more » « less
  5. Gustau Camps-Valls; Francisco J. R. Ruiz; Isabel Valera (Ed.)
    Robins et al. (2008) introduced a class of influence functions (IFs) which could be used to obtain doubly robust moment functions for the corresponding parameters. However, that class does not include the IF of parameters for which the nuisance functions are solutions to integral equations. Such parameters are particularly important in the field of causal inference, specifically in the recently proposed proximal causal inference framework of Tchetgen Tchetgen et al. (2020), which allows for estimating the causal effect in the presence of latent confounders. In this paper, we first extend the class of Robins et al. to include doubly robust IFs in which the nuisance functions are solutions to integral equations. Then we demonstrate that the double robustness property of these IFs can be leveraged to construct estimating equations for the nuisance functions, which enables us to solve the integral equations without resorting to parametric models. We frame the estimation of the nuisance functions as a minimax optimization problem. We provide convergence rates for the nuisance functions and conditions required for asymptotic linearity of the estimator of the parameter of interest. The experiment results demonstrate that our proposed methodology leads to robust and high-performance estimators for average causal effect in the proximal causal inference framework. 
    more » « less