Efficient real-time solvers for forward and inverse problems are essential in engineering and science applications. Machine learning surrogate models have emerged as promising alternatives to traditional methods, offering substantially reduced computational time. Nevertheless, these models typically demand extensive training datasets to achieve robust generalization across diverse scenarios. While physics-based approaches can partially mitigate this data dependency and ensure physics-interpretable solutions, addressing scarce data regimes remains a challenge. Both purely data-driven and physics-based machine learning approaches demonstrate severe overfitting issues when trained with insufficient data. We propose a novel model-constrained Tikhonov autoencoder neural network framework, called TAEN, capable of learning both forward and inverse surrogate models using a single arbitrary observational sample. We develop comprehensive theoretical foundations including forward and inverse inference error bounds for the proposed approach for linear cases. For comparative analysis, we derive equivalent formulations for pure data-driven and model-constrained approach counterparts. At the heart of our approach is a data randomization strategy with theoretical justification, which functions as a generative mechanism for exploring the training data space, enabling effective training of both forward and inverse surrogate models even with a single observation, while regularizing the learning process. We validate our approach through extensive numerical experiments on two challenging inverse problems: 2D heat conductivity inversion and initial condition reconstruction for time-dependent 2D Navier–Stokes equations. Results demonstrate that TAEN achieves accuracy comparable to traditional Tikhonov solvers and numerical forward solvers for both inverse and forward problems, respectively, while delivering orders of magnitude computational speedups.
                            TNet: A Model-Constrained Tikhonov Network Approach for Inverse Problems
                        
                    
    
Deep learning (DL), and deep neural networks (DNNs) in particular, is by default purely data-driven and in general does not require physics. This is the strength of DL but also one of its key limitations when applied to science and engineering problems in which underlying physical properties (such as stability, conservation, and positivity) and accuracy are required. DL methods in their original forms are not capable of respecting the underlying mathematical models or achieving desired accuracy even in big-data regimes. On the other hand, many data-driven science and engineering problems, such as inverse problems, typically have limited experimental or observational data, and DL would overfit the data in this case. Leveraging information encoded in the underlying mathematical models, we argue, not only compensates for missing information in low data regimes but also provides opportunities to equip DL methods with the underlying physics, thereby promoting better generalization. This paper develops a model-constrained deep learning approach and its variant TNet, a Tikhonov neural network, that are capable of learning not only information hidden in the training data but also in the underlying mathematical models to solve inverse problems governed by partial differential equations in low data regimes. We provide the constructions and some theoretical results for the proposed approaches for both linear and nonlinear inverse problems. Since TNet is designed to learn inverse solutions with Tikhonov regularization, it is interpretable: in fact it recovers Tikhonov solutions for linear cases while potentially approximating Tikhonov solutions to any desired accuracy for nonlinear inverse problems. We also prove that data randomization can enhance not only the smoothness of the networks but also their generalization. Comprehensive numerical results confirm the theoretical findings and show that with as few as 1 training sample for 1D deconvolution, 5 for the inverse 2D heat conductivity problem, 100 for inverse initial conditions for the time-dependent 2D Burgers' equation, and 50 for inverse initial conditions for the 2D Navier-Stokes equations, TNet solutions can be as accurate as Tikhonov solutions while being several orders of magnitude faster. This is possible owing to the model-constrained term, replications, and randomization.
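As a point of reference for the linear case mentioned in the abstract, the following is a minimal sketch of a classical Tikhonov solution computed from the normal equations. It is not the TNet network or the authors' code: the function name `tikhonov_solve`, the parameters `alpha` and `u0`, the random test matrix, and the Gaussian noise added to the data are all assumptions made for illustration. In particular, the added noise is only one possible reading of "randomization" and is not claimed to be the scheme used in the paper.

```python
import numpy as np


def tikhonov_solve(A, y, alpha, u0=None):
    """Minimize 0.5*||A u - y||^2 + 0.5*alpha*||u - u0||^2 (hypothetical helper).

    Setting the gradient to zero gives the normal equations
    (A^T A + alpha I) u = A^T y + alpha u0.
    """
    n = A.shape[1]
    if u0 is None:
        u0 = np.zeros(n)  # assumed prior mean: zero
    lhs = A.T @ A + alpha * np.eye(n)
    rhs = A.T @ y + alpha * u0
    return np.linalg.solve(lhs, rhs)


# Synthetic linear forward model y = A u (illustrative data only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
u_true = rng.standard_normal(10)
y_clean = A @ u_true

# Perturb the single observation with small Gaussian noise (an assumption).
noise_level = 0.01
y_noisy = y_clean + noise_level * rng.standard_normal(y_clean.shape)

u_tik = tikhonov_solve(A, y_noisy, alpha=1e-2)
rel_err = np.linalg.norm(u_tik - u_true) / np.linalg.norm(u_true)
print(f"relative error of the Tikhonov solution: {rel_err:.3e}")
```

A learned inverse map of the kind described above would be compared against a reference solution such as `u_tik`; the regularization weight `alpha` trades data fit against closeness to `u0`.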
- Award ID(s): 2212442
- PAR ID: 10537011
- Publisher / Repository: SIAM SISC
- Date Published:
- Journal Name: SIAM Journal on Scientific Computing
- Volume: 46
- Issue: 1
- ISSN: 1064-8275
- Page Range / eLocation ID: C77 to C100
- Subject(s) / Keyword(s): Inverse problem, randomization, model-constrained, deep learning, deep neural network, partial differential equations.
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- This paper presents a regularization framework that aims to improve the fidelity of Tikhonov inverse solutions. At the heart of the framework is the data-informed regularization idea that only data-uninformed parameters need to be regularized, while the data-informed parameters, on which data and forward model are integrated, should remain untouched. We propose to employ the active subspace method to determine the data-informativeness of a parameter. The resulting framework is thus called data-informed (DI) active subspace (DIAS) regularization. Four proposed DIAS variants are rigorously analyzed and shown to be robust with respect to the regularization parameter and capable of avoiding polluting solution features informed by the data. They are thus well suited for problems with small or reasonably small noise corruptions in the data. Furthermore, the DIAS approaches can effectively reuse any Tikhonov regularization codes/libraries. Though they are readily applicable to nonlinear inverse problems, we focus on linear problems in this paper in order to gain insights into the framework. Various numerical results for linear inverse problems are presented to verify theoretical findings and to demonstrate advantages of the DIAS framework over the Tikhonov, truncated SVD (TSVD), and TSVD-based DI approaches.
- In this paper, we consider iterative methods based on sampling for computing solutions to separable nonlinear inverse problems where the entire dataset cannot be accessed or is not available all-at-once. In such scenarios (e.g., when massive amounts of data exceed memory capabilities or when data is being streamed), solving inverse problems, especially nonlinear ones, can be very challenging. We focus on separable nonlinear problems, where the objective function is nonlinear in one (typically small) set of parameters and linear in another (larger) set of parameters. For the linear problem, we describe a limited-memory sampled Tikhonov method, and for the nonlinear problem, we describe an approach to integrate the limited-memory sampled Tikhonov method within a nonlinear optimization framework. The proposed method is computationally efficient in that it only uses available data at any iteration to update both sets of parameters. Numerical experiments applied to massive super-resolution image reconstruction problems show the power of these methods.
- Bayesian inference provides a systematic framework for integration of data with mathematical models to quantify the uncertainty in the solution of the inverse problem. However, the solution of Bayesian inverse problems governed by complex forward models described by partial differential equations (PDEs) remains prohibitive with black-box Markov chain Monte Carlo (MCMC) methods. We present hIPPYlib-MUQ, an extensible and scalable software framework that contains implementations of state-of-the-art algorithms aimed at overcoming the challenges of high-dimensional, PDE-constrained Bayesian inverse problems. These algorithms accelerate MCMC sampling by exploiting the geometry and intrinsic low-dimensionality of parameter space via derivative information and low rank approximation. The software integrates two complementary open-source software packages, hIPPYlib and MUQ. hIPPYlib solves PDE-constrained inverse problems using automatically-generated adjoint-based derivatives, but it lacks full Bayesian capabilities. MUQ provides a spectrum of powerful Bayesian inversion models and algorithms, but expects forward models to come equipped with gradients and Hessians to permit large-scale solution. By combining these two complementary libraries, we created a robust, scalable, and efficient software framework that realizes the benefits of each and allows us to tackle complex large-scale Bayesian inverse problems across a broad spectrum of scientific and engineering disciplines. To illustrate the capabilities of hIPPYlib-MUQ, we present a comparison of a number of MCMC methods available in the integrated software on several high-dimensional Bayesian inverse problems. These include problems characterized by both linear and nonlinear PDEs, various noise models, and different parameter dimensions. The results demonstrate that large (∼50×) speedups over conventional black-box and gradient-based MCMC algorithms can be obtained by exploiting Hessian information (from the log-posterior), underscoring the power of the integrated hIPPYlib-MUQ framework.
- Optical metamaterials manipulate light through various confinement and scattering processes, offering unique advantages like high performance, small form factor and easy integration with semiconductor devices. However, designing metasurfaces with suitable optical responses for complex metamaterial systems remains challenging due to the exponentially growing computation cost and the ill-posed nature of inverse problems. To expedite the computation for the inverse design of metasurfaces, a physics-informed deep learning (DL) framework is used. A tandem DL architecture with physics-based learning is used to select designs that are scientifically consistent, have low error in design prediction, and give accurate reconstruction of optical responses. The authors focus on the inverse design of a representative plasmonic device and consider the prediction of a design from the optical response at a single incident wavelength or over a spectrum of wavelengths in the visible light range. The physics-based constraint is derived from solving the electromagnetic wave equations for a simplified homogenized model. The model converges with an accuracy of up to 97% for inverse design prediction with the optical response for the visible light spectrum as input, and up to 96% with the optical response at a single wavelength as input, with an optical response reconstruction accuracy of 99%.
 An official website of the United States government