Title: A Review of Data‐Driven Discovery for Dynamic Systems
Summary Many real‐world scientific processes are governed by complex non‐linear dynamic systems that can be represented by differential equations. Recently, there has been an increased interest in learning, or discovering, the forms of the equations driving these complex non‐linear dynamic systems using data‐driven approaches. In this paper, we review the current literature on data‐driven discovery for dynamic systems. We provide a categorisation to the different approaches for data‐driven discovery and a unified mathematical framework to show the relationship between the approaches. Importantly, we discuss the role of statistics in the data‐driven discovery field, describe a possible approach by which the problem can be cast in a statistical framework and provide avenues for future work.
Award ID(s):
1853096
PAR ID:
10473084
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
International Statistical Review
Volume:
91
Issue:
3
ISSN:
0306-7734
Format(s):
Medium: X
Size(s):
p. 464-492
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract This paper explores the Glauber dynamics of spin systems with asymmetric coupling, a scenario that inherently violates detailed balance, leading to non-equilibrium steady states. By focusing on weighted and heterogeneous networks, we extend the applicability of Glauber models to capture complex real-world interactions, such as those seen in multilayer and hierarchical systems. Under specific assumptions on the coupling matrix, we demonstrate the tractability of these dynamics in the limit as the number of spins approaches infinity. Our results highlight the influence of network topology on dynamic behavior and provide a framework for analyzing stochastic processes in diverse applications, from statistical mechanics to data-driven modeling in applied sciences. The approach also uncovers potential for leveraging non-equilibrium dynamics in machine learning and network analysis. 
  2. Abstract In social science, formal and quantitative models, ranging from ones that describe economic growth to collective action, are used to formulate mechanistic explanations of the observed phenomena, provide predictions, and uncover new research questions. Here, we demonstrate the use of a machine learning system to aid the discovery of symbolic models that capture non-linear and dynamical relationships in social science datasets. By extending neuro-symbolic methods to find compact functions and differential equations in noisy and longitudinal data, we show that our system can be used to discover interpretable models from real-world data in economics and sociology. Augmenting existing workflows with symbolic regression can help uncover novel relationships and explore counterfactual models during the scientific process. We propose that this AI-assisted framework can bridge parametric and non-parametric models commonly employed in social science research by systematically exploring the space of non-linear models and enabling fine-grained control over expressivity and interpretability. 
  3. Alber, Mark (Ed.)
    Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model construction: differential equation models can be learnt from data via model discovery using sparse identification of nonlinear dynamics (SINDy). However, SINDy struggles with realistic levels of biological noise and is limited in its ability to incorporate prior knowledge of the system. We propose a data-driven framework for model discovery and model selection using hybrid dynamical systems: partial models containing missing terms. Neural networks are used to approximate the unknown dynamics of a system, enabling the denoising of the data while simultaneously learning the latent dynamics. Simulations from the fitted neural network are then used to infer models using sparse regression. We show, via model selection, that model discovery using hybrid dynamical systems outperforms alternative approaches. We find it possible to infer models correctly up to high levels of biological noise of different types. We demonstrate the potential to learn models from sparse, noisy data in application to a canonical cell state transition using data derived from single-cell transcriptomics. Overall, this approach provides a practical framework for model discovery in biology in cases where data are noisy and sparse, of particular utility when the underlying biological mechanisms are partially but incompletely known. 
  4. In complex physical systems, conventional differential equations fall short in capturing non-local and memory effects. Fractional differential equations (FDEs) effectively model long-range interactions with fewer parameters. However, deriving FDEs from physical principles remains a significant challenge. This study introduces a stepwise data-driven framework to discover explicit expressions of FDEs directly from data. The proposed framework combines deep neural networks for data reconstruction and automatic differentiation with Gauss-Jacobi quadrature for fractional derivative approximation, effectively handling singularities while achieving fast, high-precision computations across large temporal/spatial scales. To optimize both linear coefficients and the nonlinear fractional orders, we employ an alternating optimization approach that combines sparse regression with global optimization techniques. We validate the framework on various datasets, including synthetic anomalous diffusion data, experimental data on the creep behavior of frozen soils, and single-particle trajectories modeled by Lévy motion. Results demonstrate the framework’s robustness in identifying FDE structures across diverse noise levels and its ability to capture integer order dynamics, offering a flexible approach for modeling memory effects in complex systems. 
  5. Knowledge discovery and information extraction of large and complex datasets has attracted great attention in wide-ranging areas from statistics and biology to medicine. Tools from machine learning, data mining, and neurocomputing have been extensively explored and utilized to accomplish such compelling data analytics tasks. However, for time-series data presenting active dynamic characteristics, many of the state-of-the-art techniques may not perform well in capturing the inherited temporal structures in these data. In this paper, integrating the Koopman operator and linear dynamical systems theory with support vector machines (SVMs), we develop a novel dynamic data mining framework to construct low-dimensional linear models that approximate the nonlinear flow of high-dimensional time-series data generated by unknown nonlinear dynamical systems. This framework then immediately enables pattern recognition, e.g., classification, of complex time-series data to distinguish their dynamic behaviors by using the trajectories generated by the reduced linear systems. Moreover, we demonstrate the applicability and efficiency of this framework through the problems of time-series classification in bioinformatics and healthcare, including cognitive classification and seizure detection with fMRI and EEG data, respectively. The developed Koopman dynamic learning framework then lays a solid foundation for effective dynamic data mining and promises a mathematically justified method for extracting the dynamics and significant temporal structures of nonlinear dynamical systems. 
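Several of the abstracts above describe sparse regression as the step that selects which terms appear in a discovered differential equation. The following is a minimal sketch of that idea using sequentially thresholded least squares, the optimiser commonly associated with SINDy. It is not taken from any of the listed papers. The logistic test system, the polynomial candidate library, the threshold value and the helper name `stlsq` are all assumptions chosen for illustration.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (hypothetical helper).

    theta : (n_samples, n_terms) matrix of candidate library terms
    dxdt  : (n_samples,) estimated time derivative
    Returns a sparse coefficient vector xi with dxdt ~= theta @ xi.
    """
    # Start from the dense least-squares fit.
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        # Zero out coefficients below the threshold ...
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        active = ~small
        if not active.any():
            break
        # ... and refit using only the surviving library terms.
        xi[active] = np.linalg.lstsq(theta[:, active], dxdt, rcond=None)[0]
    return xi


# Assumed test system: logistic growth dx/dt = x - x^2 with x(0) = 0.1,
# sampled from its closed-form solution (noise-free, for illustration only).
t = np.linspace(0.0, 5.0, 5001)
x = 1.0 / (1.0 + 9.0 * np.exp(-t))

# Estimate the derivative from the sampled trajectory by finite differences.
dxdt = np.gradient(x, t, edge_order=2)

# Candidate library of polynomial terms: 1, x, x^2, x^3.
names = ["1", "x", "x^2", "x^3"]
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

xi = stlsq(theta, dxdt)
for name, coef in zip(names, xi):
    if coef != 0.0:
        print(f"{name}: {coef:+.4f}")
```

On this noise-free example the surviving terms should be `x` and `x^2`, with coefficients close to +1 and -1. With noisy or sparsely sampled data, the derivative estimate and the threshold would both need more care, which is the difficulty several of the abstracts above address.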