Title: On sparse regression, Lp-regularization, and automated model discovery
Sparse regression and feature extraction are the cornerstones of knowledge discovery from massive data. Their goal is to discover interpretable and predictive models that provide simple relationships among scientific variables. While the statistical tools for model discovery are well established in the context of linear regression, their generalization to nonlinear regression in material modeling is highly problem-specific and insufficiently understood. Here we explore the potential of neural networks for automatic model discovery and induce sparsity by a hybrid approach that combines two strategies: regularization and physical constraints. We integrate the concept of Lp regularization for subset selection with constitutive neural networks that leverage our domain knowledge in kinematics and thermodynamics. We train our networks with both synthetic and real data, and perform several thousand discovery runs to infer common guidelines and trends: L2 regularization or ridge regression is unsuitable for model discovery; L1 regularization or lasso promotes sparsity, but induces strong bias that may aggressively change the results; only L0 regularization allows us to transparently fine-tune the trade-off between interpretability and predictability, simplicity and accuracy, and bias and variance. With these insights, we demonstrate that Lp regularized constitutive neural networks can simultaneously discover both interpretable models and physically meaningful parameters. We anticipate that our findings will generalize to alternative discovery techniques such as sparse and symbolic regression, and to other domains such as biology, chemistry, or medicine. Our ability to automatically discover material models from data could have tremendous applications in generative material design and open new opportunities to manipulate matter, alter properties of existing materials, and discover new materials with user-defined properties.
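The L2/L1/L0 trade-off described in the abstract can be illustrated on a toy sparse linear regression problem. This is a minimal numpy sketch, not the paper's constitutive-network code; all values and names are illustrative. Ridge has a closed form and never zeroes coefficients; lasso is solved here by ISTA (proximal gradient with soft thresholding) and is sparse but shrinks the surviving coefficients; L0-constrained best-subset selection is solved by exhaustive search, which is only feasible for a small number of features.

```python
# Toy comparison of L2 (ridge), L1 (lasso), and L0 (best subset) regularization
# on a synthetic sparse linear problem; illustrative, not the paper's code.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -3.0, 0.0, 0.0])   # sparse ground truth
y = X @ beta_true + 0.1 * rng.normal(size=n)

# L2 / ridge: closed form; shrinks but never zeroes coefficients (dense result).
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# L1 / lasso via ISTA: proximal gradient descent with soft thresholding.
def lasso_ista(X, y, lam, n_iter=5000):
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = beta - X.T @ (X @ beta - y) / L
        beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return beta

beta_lasso = lasso_ista(X, y, lam=20.0)    # sparse, but coefficients are biased low

# L0 / best subset of size k: exhaustive search (feasible only for small p).
def best_subset(X, y, k):
    best_beta, best_err = None, np.inf
    for S in itertools.combinations(range(X.shape[1]), k):
        coef = np.linalg.lstsq(X[:, S], y, rcond=None)[0]
        err = np.sum((X[:, S] @ coef - y) ** 2)
        if err < best_err:
            best_beta = np.zeros(X.shape[1])
            best_beta[list(S)] = coef      # unbiased least squares on the support
            best_err = err
    return best_beta

beta_l0 = best_subset(X, y, k=2)
```

On this example the ridge solution is fully dense, the lasso solution is sparse but its leading coefficient is visibly shrunk below the true value of 2.0, and the L0 solution recovers both the true support and nearly unbiased parameter values, mirroring the trends reported in the abstract.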
Award ID(s):
2320933
PAR ID:
10512698
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Wiley
Date Published:
Journal Name:
International Journal for Numerical Methods in Engineering
Volume:
125
Issue:
14
ISSN:
0029-5981
Page Range / eLocation ID:
e7481
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Personalized computational simulations have emerged as a vital tool to understand the biomechanical factors of a disease, predict disease progression, and design personalized intervention. Material modeling is critical for realistic biomedical simulations, and poor model selection can have life-threatening consequences for the patient. However, selecting the best model requires a profound domain knowledge and is limited to a few highly specialized experts in the field. Here we explore the feasibility of eliminating user involvement and automate the process of material modeling in finite element analyses. We leverage recent developments in constitutive neural networks, machine learning, and artificial intelligence to discover the best constitutive model from thousands of possible combinations of a few functional building blocks. We integrate all discoverable models into the finite element workflow by creating a universal material subroutine that contains more than 60,000 models, made up of 16 individual terms. We prototype this workflow using biaxial extension tests from healthy human arteries as input and stress and stretch profiles across the human aortic arch as output. Our results suggest that constitutive neural networks can robustly discover various flavors of arterial models from data, feed these models directly into a finite element simulation, and predict stress and strain profiles that compare favorably to the classical Holzapfel model. Replacing dozens of individual material subroutines by a single universal material subroutine—populated directly via automated model discovery—will make finite element simulations more user-friendly, more robust, and less vulnerable to human error. 
Democratizing finite element simulation by automating model selection could induce a paradigm shift in physics-based modeling, broaden access to simulation technologies, and empower individuals with varying levels of expertise and diverse backgrounds to actively participate in scientific discovery and push the boundaries of biomedical simulation. 
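The "more than 60,000 models, made up of 16 individual terms" quoted above is presumably combinatorial: every on/off subset of the 16 candidate terms defines one discoverable model. A one-line sanity check (illustrative):

```python
import math

n_terms = 16
# every subset of the 16 candidate terms is one discoverable model
n_models = 2 ** n_terms
# equivalently, sum over models with exactly k active terms
assert n_models == sum(math.comb(n_terms, k) for k in range(n_terms + 1))
print(n_models)  # 65536, consistent with "more than 60,000 models"
```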
  2. Abstract Accurate modeling of cardiovascular tissues is crucial for understanding and predicting their behavior in various physiological and pathological conditions. In this study, we specifically focus on the pulmonary artery in the context of the Ross procedure, using neural networks to discover the most suitable material model. The Ross procedure is a complex cardiac surgery where the patient's own pulmonary valve is used to replace the diseased aortic valve. Ensuring successful long-term outcomes of this intervention requires a detailed understanding of the mechanical properties of pulmonary tissue. Constitutive artificial neural networks offer a novel approach to capture such complex stress-strain relationships. Here, we design and train different constitutive neural networks to characterize the hyperelastic, anisotropic behavior of the main pulmonary artery. Informed by experimental biaxial testing data under various axial-circumferential loading ratios, these networks autonomously discover the inherent material behavior, without the limitations of predefined mathematical models. We regularize the model discovery using cross-sample feature selection and explore its sensitivity to the collagen fiber distribution. Strikingly, we uniformly discover an isotropic exponential first-invariant term and an anisotropic quadratic fifth-invariant term. We show that constitutive models with both these terms can reliably predict arterial responses under diverse loading conditions. Our results provide substantial improvements in agreement with experimental data and enhance our understanding of the biomechanical properties of pulmonary tissue. The model outcomes can be used in a variety of computational frameworks of autograft adaptation, ultimately improving surgical outcomes after the Ross procedure. 
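In one common parameterization (with illustrative symbols a, b, c, not the paper's fitted values), the two uniformly discovered terms correspond to a strain energy of the form

```latex
\psi = \frac{a}{2b}\Big[\exp\big(b\,[I_1 - 3]\big) - 1\Big] \;+\; c\,\big(I_5 - 1\big)^2,
\qquad
I_1 = \operatorname{tr}\mathbf{C},
\qquad
I_5 = \mathbf{n}_0 \cdot \mathbf{C}^2\,\mathbf{n}_0,
```

where C = FᵀF is the right Cauchy-Green deformation tensor and n₀ the referential fiber direction: the first summand is the isotropic exponential first-invariant term, the second the anisotropic quadratic fifth-invariant term.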
  3. Abstract Harnessing data to discover the underlying governing laws or equations that describe the behavior of complex physical systems can significantly advance our modeling, simulation, and understanding of such systems in various science and engineering disciplines. This work introduces a novel approach, called physics-informed neural network with sparse regression, to discover governing partial differential equations from scarce and noisy data for nonlinear spatiotemporal systems. In particular, this discovery approach seamlessly integrates the strengths of deep neural networks for rich representation learning, physics embedding, automatic differentiation, and sparse regression to approximate the solution of system variables, compute essential derivatives, and identify the key derivative terms and parameters that form the structure and explicit expression of the equations. The efficacy and robustness of this method are demonstrated, both numerically and experimentally, on discovering a variety of partial differential equation systems with different levels of data scarcity and noise, and with different initial and boundary conditions. The resulting computational framework shows the potential for closed-form model discovery in practical applications where large and accurate datasets are intractable to capture. 
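The sparse-regression step at the heart of this kind of PDE discovery can be sketched in a few lines of numpy (illustrative, not the authors' implementation; analytic derivatives of a heat-equation solution stand in for the network's automatic differentiation). Sequential threshold least squares fits u_t against a library of candidate derivative terms, prunes small coefficients, and refits on the survivors:

```python
# Sparse regression for PDE discovery: recover u_t = nu * u_xx from data.
# Illustrative sketch; derivatives are analytic here for simplicity.
import numpy as np

nu = 0.5
x, t = np.meshgrid(np.linspace(0, 2 * np.pi, 50), np.linspace(0, 1, 50))
u    =  np.exp(-nu * t) * np.sin(x)     +     np.exp(-4 * nu * t) * np.sin(2 * x)
u_x  =  np.exp(-nu * t) * np.cos(x)     + 2 * np.exp(-4 * nu * t) * np.cos(2 * x)
u_xx = -np.exp(-nu * t) * np.sin(x)     - 4 * np.exp(-4 * nu * t) * np.sin(2 * x)
u_t  = nu * u_xx                                  # heat equation
u_t  = u_t + 0.001 * np.random.default_rng(1).normal(size=u_t.shape)  # noise

# Candidate library of terms (columns) and the flattened target u_t.
names = ["u", "u_x", "u_xx", "u*u_x"]
Theta = np.column_stack([u.ravel(), u_x.ravel(), u_xx.ravel(), (u * u_x).ravel()])
target = u_t.ravel()

def stls(Theta, target, threshold=0.1, n_iter=10):
    """Sequential threshold least squares: fit, prune small terms, refit."""
    xi = np.linalg.lstsq(Theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if big.any():
            xi[big] = np.linalg.lstsq(Theta[:, big], target, rcond=None)[0]
    return xi

xi = stls(Theta, target)
discovered = {name: c for name, c in zip(names, xi) if c != 0.0}
print(discovered)  # expect only 'u_xx', with a coefficient close to nu = 0.5
```

In the full framework described above, the same regression runs on derivatives produced by automatic differentiation of the trained network, which is what makes the approach robust to scarce and noisy measurements.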
  4. Alber, Mark (Ed.)
    Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model construction: differential equation models can be learnt from data via model discovery using sparse identification of nonlinear dynamics (SINDy). However, SINDy struggles with realistic levels of biological noise and is limited in its ability to incorporate prior knowledge of the system. We propose a data-driven framework for model discovery and model selection using hybrid dynamical systems: partial models containing missing terms. Neural networks are used to approximate the unknown dynamics of a system, enabling the denoising of the data while simultaneously learning the latent dynamics. Simulations from the fitted neural network are then used to infer models using sparse regression. We show, via model selection, that model discovery using hybrid dynamical systems outperforms alternative approaches. We find it possible to infer models correctly up to high levels of biological noise of different types. We demonstrate the potential to learn models from sparse, noisy data in application to a canonical cell state transition using data derived from single-cell transcriptomics. Overall, this approach provides a practical framework for model discovery in biology in cases where data are noisy and sparse, of particular utility when the underlying biological mechanisms are partially but incompletely known. 
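The motivation for denoising before sparse regression can be illustrated on logistic growth, x' = r x - (r/K) x², with a SINDy-style thresholded regression (a toy sketch, not the authors' code; here the exact dynamics stand in for the neural-network approximation of the latent trajectory, and finite differences of noisy data play the role of the naive approach):

```python
# SINDy-style thresholded regression on logistic growth: noisy finite-difference
# derivatives degrade discovery, while smoothed dynamics recover the model.
import numpy as np

rng = np.random.default_rng(2)
r, K = 1.0, 2.0
t = np.linspace(0, 10, 200)
x = K / (1 + (K / 0.1 - 1) * np.exp(-r * t))      # exact logistic solution, x(0)=0.1
x_noisy = x + 0.05 * rng.normal(size=x.size)      # "biological" measurement noise

def sindy(x, dxdt, threshold=0.05):
    # library [1, x, x^2, x^3]; thresholded least squares with prune-and-refit
    Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
    xi = np.linalg.lstsq(Theta, dxdt, rcond=None)[0]
    for _ in range(5):
        xi[np.abs(xi) < threshold] = 0.0
        keep = xi != 0.0
        if keep.any():
            xi[keep] = np.linalg.lstsq(Theta[:, keep], dxdt, rcond=None)[0]
    return xi

# (a) finite differences of noisy data: differentiation amplifies the noise
xi_noisy = sindy(x_noisy, np.gradient(x_noisy, t))
# (b) derivatives from a smooth surrogate (standing in for the fitted network)
xi_smooth = sindy(x, r * x * (1 - x / K))
print(xi_smooth)  # approximately [0, 1.0, -0.5, 0]: x' = r x - (r/K) x^2
```

The smooth-surrogate regression recovers the logistic terms essentially exactly, while the noisy finite-difference version does not, which is the gap the hybrid neural-network denoising step is designed to close.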
  5. We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime, where the networks' biases are initialized to some constant rather than zero. We prove that under such initialization, the neural network will have sparse activation throughout the entire training process, which enables fast training procedures via computational methods that exploit this sparsity. With such initialization, we show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK, and we study various properties of the neural networks with this new kernel. We first characterize the gradient descent dynamics. In particular, we show that the network in this case can achieve as fast convergence as the dense network, as opposed to previous work suggesting that sparse networks converge more slowly. In addition, our result improves on the previously required width to ensure convergence. Second, we study the networks' generalization: we show a width-sparsity dependence, which yields a sparsity-dependent Rademacher complexity and generalization bound. To our knowledge, this is the first sparsity-dependent generalization result via Rademacher complexity. Lastly, we study the smallest eigenvalue of this new kernel. We identify a data-dependent region where we can derive a much sharper lower bound on the NTK's smallest eigenvalue than the worst-case bound previously known. This can lead to an improvement in the generalization bound. 
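The sparse-activation mechanism can be checked numerically with a toy sketch (assumed unit-norm inputs and standard-normal weights; the constants are illustrative, not the paper's setting). For a unit-norm input x and a weight row w with N(0,1) entries, w·x ~ N(0,1), so with bias b = -1 each ReLU unit fires with probability Φ(-1) ≈ 0.159 at initialization, versus 0.5 for zero bias:

```python
# Numerical illustration: constant negative bias initialization makes a
# one-hidden-layer ReLU network sparsely activated from the start.
import numpy as np

rng = np.random.default_rng(3)
d, m, n = 64, 4096, 256                 # input dim, hidden width, sample count
W = rng.normal(size=(m, d))             # standard-normal weights (NTK-style init)
b = -1.0                                # constant negative bias initialization
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs

active_frac = np.mean(X @ W.T + b > 0)  # fraction of ReLU units that fire
zero_bias_frac = np.mean(X @ W.T > 0)   # reference: zero-bias init fires ~50%
print(active_frac, zero_bias_frac)      # roughly 0.16 vs roughly 0.5
```

Only the initialization-time picture is shown here; the paper's contribution is proving that this sparsity persists throughout training and characterizing the resulting bias-generalized NTK.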