Title: Robust learning from noisy, incomplete, high-dimensional experimental data via physically constrained symbolic regression
Abstract

Machine learning offers an intriguing alternative to first-principles analysis for discovering new physics from experimental data. However, to date, purely data-driven methods have only proven successful in uncovering physical laws describing simple, low-dimensional systems with low levels of noise. Here we demonstrate that combining a data-driven methodology with some general physical principles enables discovery of a quantitatively accurate model of a non-equilibrium spatially extended system from high-dimensional data that is both noisy and incomplete. We illustrate this using an experimental weakly turbulent fluid flow where only the velocity field is accessible. We also show that this hybrid approach allows reconstruction of the inaccessible variables – the pressure and forcing field driving the flow.

 
Award ID(s):
1725587
PAR ID:
10231943
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Nature Communications
Volume:
12
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Identification of a heterogeneous conductivity field and reconstruction of a contaminant release history are key aspects of subsurface remediation. These two goals are achieved by combining model predictions with sparse and noisy hydraulic head and concentration measurements. Solution of this inverse problem is notoriously difficult due, in part, to the high dimensionality of the parameter space and the high computational cost of repeated forward solves. We use a convolutional adversarial autoencoder (CAAE) to parameterize a heterogeneous non‐Gaussian conductivity field via a low‐dimensional latent representation. A three‐dimensional dense convolutional encoder‐decoder (DenseED) network serves as a forward surrogate of the flow and transport model. The CAAE‐DenseED surrogate is fed into the ensemble smoother with multiple data assimilation (ESMDA) algorithm to sample from the Bayesian posterior distribution of the unknown parameters, forming a CAAE‐DenseED‐ESMDA inversion framework. The resulting CAAE‐DenseED‐ESMDA inversion strategy is used to identify a three‐dimensional contaminant source and conductivity field. A comparison of the inversion results from CAAE‐ESMDA with a physical flow and transport simulator and from CAAE‐DenseED‐ESMDA shows that the latter yields accurate reconstruction results at a fraction of the computational cost of the former.

     
  2. Abstract

    Dimensionless numbers and scaling laws provide elegant insights into the characteristic properties of physical systems. Classical dimensional analysis and similitude theory fail to identify a set of unique dimensionless numbers for a highly multi-variable system with incomplete governing equations. This paper introduces a mechanistic data-driven approach that embeds the principle of dimensional invariance into a two-level machine learning scheme to automatically discover dominant dimensionless numbers and governing laws (including scaling laws and differential equations) from scarce measurement data. The proposed methodology, called dimensionless learning, is a physics-based dimension reduction technique. It can reduce high-dimensional parameter spaces to descriptions involving only a few physically interpretable dimensionless parameters, greatly simplifying complex process design and system optimization. We demonstrate the algorithm by solving several challenging engineering problems with noisy experimental measurements (not synthetic data) collected from the literature. Examples include turbulent Rayleigh-Bénard convection, vapor depression dynamics in laser melting of metals, and porosity formation in 3D printing. Lastly, we show that the proposed approach can identify dimensionally homogeneous differential equations with dimensionless number(s) by leveraging sparsity-promoting techniques.

     
  3. High-fidelity blood flow modelling is crucial for enhancing our understanding of cardiovascular disease. Despite significant advances in computational and experimental characterization of blood flow, the knowledge that we can acquire from such investigations remains limited by the presence of uncertainty in parameters, low resolution, and measurement noise. Additionally, extracting useful information from these datasets is challenging. Data-driven modelling techniques have the potential to overcome these challenges and transform cardiovascular flow modelling. Here, we review several data-driven modelling techniques, highlight the common ideas and principles that emerge across numerous such techniques, and provide illustrative examples of how they could be used in the context of cardiovascular fluid mechanics. In particular, we discuss principal component analysis (PCA), robust PCA, compressed sensing, the Kalman filter for data assimilation, low-rank data recovery, and several additional methods for reduced-order modelling of cardiovascular flows, including the dynamic mode decomposition and the sparse identification of nonlinear dynamics. All techniques are presented in the context of cardiovascular flows with simple examples. These data-driven modelling techniques have the potential to transform computational and experimental cardiovascular research, and we discuss challenges and opportunities in applying these techniques in the field, looking ultimately towards data-driven patient-specific blood flow modelling.
  4. The phase-field method is an attractive computational tool for simulating microstructural evolution during phase separation, including solidification and spinodal decomposition. However, the high computational cost associated with solving phase-field equations currently limits our ability to comprehend phase transformations. This article reports a novel phase-field emulator based on the tensor decomposition of the evolving microstructures and their corresponding two-point correlation functions to predict microstructural evolution at arbitrarily small time scales that are otherwise nontrivial to achieve using traditional phase-field approaches. The reported technique is based on obtaining a low-dimensional representation of the microstructures via tensor decomposition, and subsequently, predicting the microstructure evolution in the low-dimensional space using Gaussian process regression (GPR). Once we obtain the microstructure prediction in the low-dimensional space, we employ a hybrid input–output phase-retrieval algorithm to reconstruct the microstructures. As proof of concept, we present the results on microstructure prediction for spinodal decomposition, although the method itself is agnostic of the material parameters. Results show that we are able to predict microstructure evolution sequences that closely resemble the true microstructures (average normalized mean square of 6.78×10^−7) at time scales half of that employed in obtaining training data. Our data-driven microstructure emulator opens new avenues to predict the microstructural evolution by leveraging phase-field simulations and physical experimentation where the time resolution is often quite large due to limited resources and physical constraints, such as the phase coarsening experiments previously performed in microgravity. 
  5. The near-field characteristics of highly buoyant plumes, commonly referred to as lazy plumes, remain relatively poorly understood across a range of flow conditions, particularly compared with our understanding of far-field characteristics. Here, we perform fully resolved three-dimensional numerical simulations of round helium plumes to characterize the effects of different inlet Richardson, ${Ri}_0$, and Reynolds, ${Re}_0$, numbers on first- and second-order statistical moments as well as average vertical fluxes in the near field. For sufficiently high ${Re}_0$ at a particular ${Ri}_0$, heavy air can penetrate the core of the plume, reminiscent of spikes in the classical Rayleigh–Taylor instability. In the most turbulent simulation, this penetration becomes so strong that a recirculation zone forms along the centreline of the plume. Vertical fluxes are found to scale linearly with vertical distance from the plume inlet, consistent with experimental and numerical observations (Jiang & Luo, Flow Turbul. Combust., vol. 64, 2000, pp. 43–69; Kaye & Hunt, Intl J. Heat Fluid Flow, vol. 30, 2009, pp. 1099–1105). We analytically derive this linear scaling from the governing equations by making a radial entrainment hypothesis whereby ambient fluid is entrained, on average, only in the radial direction at a finite distance from the inlet. Through this derivation, we identify physical mechanisms that can cause these relationships to remain only approximately valid for the present simulations. Lastly, we identify near-field power-law scaling relations for the flux magnitudes based on ${Ri}_0$, and also examine vertical profiles of the non-dimensional Richardson number flux. Ultimately, insights from the present simulations are used to define near-, intermediate- and far-field regions in buoyant plumes.