Machine learning methods are increasingly being employed as surrogate models in place of computationally expensive and slow numerical integrators for a bevy of applications in the natural sciences. However, while the laws of physics are relationships between scalars, vectors and tensors that hold regardless of the frame of reference or chosen coordinate system, surrogate machine learning models are not coordinate-free by default. We enforce coordinate freedom by using geometric convolutions in three model architectures: a ResNet, a Dilated ResNet and a UNet. In numerical experiments emulating two-dimensional compressible Navier–Stokes, we see better accuracy and improved stability compared with baseline surrogate models in almost all cases. The ease of enforcing coordinate freedom without making major changes to the model architecture provides an exciting recipe for any convolutional neural network-based method applied to an appropriate class of problems.
more »
« less
This content will become publicly available on December 31, 2025
Robust Emulator for Compressible Navier-Stokes using Equivariant Geometric Convolutions
Recent methods to simulate complex fluid dynamics problems have replaced computationally expensive and slow numerical integrators with surrogate models learned from data. However, while the laws of physics are relationships between scalars, vectors, and tensors that hold regardless of the frame of reference or chosen coordinate system, surrogate machine learning models are not coordinate-free by default. We enforce coordinate freedom by using geometric convolutions in three model architectures: a ResNet, a Dilated ResNet, and a UNet. In numerical experiments emulating 2D compressible Navier-Stokes, we see better accuracy and improved stability compared to baseline surrogate models in almost all cases. The ease of enforcing coordinate freedom without making major changes to the model architecture provides an exciting recipe for any CNN-based method applied on an appropriate class of problems.
more »
« less
- Award ID(s):
- 2339682
- PAR ID:
- 10609705
- Publisher / Repository:
- NeurIPS workshop machine learning for the physical sciences
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Efficient real-time solvers for forward and inverse problems are essential in engineering and science applications. Machine learning surrogate models have emerged as promising alter- natives to traditional methods, offering substantially reduced computational time. Never- theless, these models typically demand extensive training datasets to achieve robust gen- eralization across diverse scenarios. While physics-based approaches can partially mitigate this data dependency and ensure physics-interpretable solutions, addressing scarce data regimes remains a challenge. Both purely data-driven and physics-based machine learning approaches demonstrate severe overfitting issues when trained with insufficient data. We propose a novel model-constrained Tikhonov autoencoder neural network framework, called TAEN, capable of learning both forward and inverse surrogate models using a single arbitrary observational sample. We develop comprehensive theoretical foundations including forward and inverse inference error bounds for the proposed approach for linear cases. For compara- tive analysis, we derive equivalent formulations for pure data-driven and model-constrained approach counterparts. At the heart of our approach is a data randomization strategy with theoretical justification, which functions as a generative mechanism for exploring the train- ing data space, enabling effective training of both forward and inverse surrogate models even with a single observation, while regularizing the learning process. We validate our approach through extensive numerical experiments on two challenging inverse problems: 2D heat conductivity inversion and initial condition reconstruction for time-dependent 2D Navier–Stokes equations. Results demonstrate that TAEN achieves accuracy comparable to traditional Tikhonov solvers and numerical forward solvers for both inverse and forward problems, respectively, while delivering orders of magnitude computational speedups.more » « less
-
Abstract We introduce the concept of decision‐focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real‐time settings. The proposed data‐driven framework seeks to learn a simpler, for example, convex, surrogate optimization model that is trained to minimize thedecision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data‐driven inverse optimization problem to which we apply a decomposition‐based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision‐focused surrogate modeling with standard data‐driven surrogate modeling methods and demonstrate that our approach is significantly more data‐efficient while producing simple surrogate models with high decision prediction accuracy.more » « less
-
Recent works have shown that imposing tensor structures on the coefficient tensor in regression problems can lead to more reliable parameter estimation and lower sample complexity compared to vector-based methods. This work investigates a new low-rank tensor model, called Low Separation Rank (LSR), in Generalized Linear Model (GLM) problems. The LSR model – which generalizes the well-known Tucker and CANDECOMP/PARAFAC (CP) models, and is a special case of the Block Tensor Decomposition (BTD) model – is imposed onto the coefficient tensor in the GLM model. This work proposes a block coordinate descent algorithm for parameter estimation in LSR-structured tensor GLMs. Most importantly, it derives a minimax lower bound on the error threshold on estimating the coefficient tensor in LSR tensor GLM problems. The minimax bound is proportional to the intrinsic degrees of freedom in the LSR tensor GLM problem, suggesting that its sample complexity may be significantly lower than that of vectorized GLMs. This result can also be specialised to lower bound the estimation error in CP and Tucker-structured GLMs. The derived bounds are comparable to tight bounds in the literature for Tucker linear regression, and the tightness of the minimax lower bound is further assessed numerically. Finally, numerical experiments on synthetic datasets demonstrate the efficacy of the proposed LSR tensor model for three regression types (linear, logistic and Poisson). Experiments on a collection of medical imaging datasets demonstrate the usefulness of the LSR model over other tensor models (Tucker and CP) on real, imbalanced data with limited available samples. License: Creative Commons Attribution 4.0 International (CC BY 4.0)more » « less
-
Real-time accurate solutions of large-scale complex dynamical systems are critically needed for control, optimization, uncertainty quantification, and decision-making in practical engineering and science applications, particularly in digital twin contexts. Recent research on hybrid approaches combining numerical methods and machine learning in end-to-end training has shown significant improvements over either approach alone. However, using neural networks as surrogate models generally exhibits limitations in generalizability over different settings and in capturing the evolution of solution discontinuities. In this work, we develop a model-constrained discontinuous Galerkin Network DGNet approach, a significant extension to our previous work, for compressible Euler equations with out-of-distribution generalization. The core of DGNet is the synergy of several key strategies: (i) leveraging time integration schemes to capture temporal correlation and taking advantage of neural network speed for computation time reduction. This is the key to the temporal discretization-invariant property of DGNet; (ii) employing a model-constrained approach to ensure the learned tangent slope satisfies governing equations; (iii) utilizing a DG-inspired architecture for GNN where edges represent Riemann solver surrogate models and nodes represent volume integration correction surrogate models, enabling capturing discontinuity capability, aliasing error reduction, and mesh discretization generalizability. Such a design allows DGNet to learn the DG spatial discretization accurately; (iv) developing an input normalization strategy that allows surrogate models to generalize across different initial conditions, geometries, meshes, boundary conditions, and solution orders. In fact, the normalization is the key to spatial discretization-invariance for DGNet; and (v) incorporating a data randomization technique that not only implicitly promotes agreement between surrogate models and true numerical models up to second-order derivatives, ensuring long-term stability and prediction capacity, but also serves as a data generation engine during training, leading to enhanced generalization on unseen data. To validate the theoretical results, effectiveness, stability, and generalizability of our novel DGNet approach, we present comprehensive numerical results for 1D and 2D compressible Euler equation problems, including Sod Shock Tube, Lax Shock Tube, Isentropic Vortex, Forward Facing Step, Scramjet, Airfoil, Euler Benchmarks, Double Mach Reflection, and a Hypersonic Sphere Cone benchmark.more » « less
An official website of the United States government
