The data now collected in physical and engineering systems enables a range of machine learning methods for system monitoring and control, even when physical knowledge at the system edge is limited and difficult to recover completely. Solving such problems typically requires identifying the forward mapping from system states to output measurements. A forward digital twin alone, however, cannot provide complete monitoring functions such as state estimation, i.e., inferring the states from measurements. While one can learn the inverse mapping directly, it is more desirable to reuse the forward digital twin, since physical laws are relatively easy to embed there to regularize the inverse process and avoid overfitting. To this end, this paper proposes an invertible learning structure that designs parallel paths in structured neural networks with basis functionals and embeds virtual storage variables for information preservation. Such two-way digital twin modeling faces an additional challenge: the system inverse admits multiple solutions, contradicting the reality that the current system has exactly one feasible operating point. To avoid an ambiguous inverse, the proposed model maximizes the physical likelihood to contract the original solution space, yielding the unique system operating status of interest. We validate the proposed method on various physical system monitoring tasks and scenarios, such as inverse kinematics problems and power system state estimation. By building a perfectly matched forward-inverse pair, the proposed method obtains accurate and computation-efficient inverse predictions given observations. Finally, the forward physical interpretation and small prediction errors support the explainability of the invertible structure compared to standard learning methods.
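To make the forward-inverse pairing concrete, below is a minimal sketch of an additive coupling block, a generic construction in the spirit of the parallel-path design described above; it is not the paper's exact architecture, and the toy two-layer map `t` is an illustrative stand-in for the basis functionals. One path passes part of the state through unchanged, playing the role of a stored variable, so the shift applied to the other path can be undone exactly.

```python
# A minimal sketch of an invertible coupling block (generic RealNVP-style
# construction, NOT the paper's exact architecture). The inner map t and
# its weights are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # toy two-layer "basis" net
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

def t(u):
    # small feed-forward map used inside the coupling; any function works,
    # because invertibility comes from the coupling structure, not from t
    return np.tanh(u @ W1.T + b1) @ W2.T + b2

def forward(x):
    # split the state into two parallel paths; one path is stored
    # untouched (information preservation), the other is shifted by t
    x1, x2 = x[:2], x[2:]
    return np.concatenate([x1, x2 + t(x1)])

def inverse(y):
    # exact inverse: recover the second path by subtracting the same shift
    y1, y2 = y[:2], y[2:]
    return np.concatenate([y1, y2 - t(y1)])

x = rng.normal(size=4)
assert np.allclose(inverse(forward(x)), x)  # forward-inverse pair matches
```

Because invertibility comes from the coupling structure rather than from `t`, the inner map can be made arbitrarily expressive without losing the exact inverse.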
Inverse Problems, Deep Learning, and Symmetry Breaking
In many physical systems, inputs related by intrinsic system symmetries are mapped to the same output. When inverting such systems, i.e., solving the associated inverse problems, there is no unique solution. This causes a fundamental difficulty in deploying the emerging end-to-end deep learning approach. Using the generalized phase retrieval problem as an illustrative example, we show that careful symmetry breaking on the training data can remove this difficulty and significantly improve learning performance. We also extract and highlight the underlying mathematical principle of the proposed solution, which is directly applicable to other inverse problems. A full-length version of this paper can be found at https://arxiv.org/abs/2003.09077.
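To illustrate the symmetry-breaking recipe, the sketch below constructs a training set for real-valued phase retrieval, where x and -x produce identical measurements y = |Ax|. Each training target is collapsed to a canonical representative of its symmetry class before any network sees it; the sign rule used here is one simple choice, not necessarily the paper's exact scheme, and all sizes are illustrative.

```python
# A hedged sketch of symmetry breaking for real-valued phase retrieval:
# y = |Ax| is blind to the global sign of x, so the raw pairs (y, x) form
# an ill-posed regression problem. Canonicalizing targets removes it.
import numpy as np

rng = np.random.default_rng(0)
n, m, N = 8, 32, 1000            # signal dim, measurements, samples
A = rng.normal(size=(m, n))

def canonicalize(x):
    # break the sign symmetry: flip x so its largest-magnitude entry is positive
    k = np.argmax(np.abs(x))
    return x if x[k] > 0 else -x

X = rng.normal(size=(N, n))
Y = np.abs(X @ A.T)              # measurements are sign-blind
X_canon = np.stack([canonicalize(x) for x in X])

# (Y, X_canon) is now a well-posed dataset: every y has a unique target,
# so a network trained on it no longer averages over the two symmetric
# solutions x and -x.
```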
- Award ID(s): 1838159
- PAR ID: 10198731
- Date Published:
- Journal Name: ICML workshop on ML Interpretability for Scientific Discovery
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Despite the recent popularity of attention-based neural architectures in core AI fields like natural language processing (NLP) and computer vision (CV), their potential in modeling complex physical systems remains underexplored. Learning problems in physical systems are often characterized as discovering operators that map between function spaces based on a few instances of function pairs. This task frequently presents a severely ill-posed PDE inverse problem. In this work, we propose a novel neural operator architecture based on the attention mechanism, which we refer to as the Nonlocal Attention Operator (NAO), and explore its capability in developing a foundation physical model. In particular, we show that the attention mechanism is equivalent to a double integral operator that enables nonlocal interactions among spatial tokens, with a data-dependent kernel characterizing the inverse mapping from data to the hidden parameter field of the underlying operator. As such, the attention mechanism extracts global prior information from training data generated by multiple systems, and suggests the exploratory space in the form of a nonlinear kernel map. Consequently, NAO can address ill-posedness and rank deficiency in inverse PDE problems by encoding regularization and achieving generalizability. We empirically demonstrate the advantages of NAO over baseline neural models in terms of generalizability to unseen data resolutions and system states. Our work not only suggests a novel neural operator architecture for learning interpretable foundation models of physical systems, but also offers a new perspective towards understanding the attention mechanism. Our code and data accompanying this paper are available at https://github.com/fishmoon1234/NAO.
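The kernel reading of attention can be made concrete in a few lines: the row-softmax score matrix acts as a data-dependent kernel k(x, x') applied to value tokens, i.e., a discretized double integral operator. The sketch below shows this standard attention computation on a uniform grid; the shapes, weights, and quadrature factor are illustrative assumptions, and it is not NAO's full architecture.

```python
# A minimal sketch of attention viewed as a discretized integral operator.
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d = 64, 16                       # spatial grid points, feature dim
u = rng.normal(size=(n_tokens, d))         # token features (function samples)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Q, K, V = u @ Wq, u @ Wk, u @ Wv
scores = Q @ K.T / np.sqrt(d)
e = np.exp(scores - scores.max(axis=1, keepdims=True))
kernel = e / e.sum(axis=1, keepdims=True)  # row-softmax: data-dependent kernel

dx = 1.0 / n_tokens                        # quadrature weight, uniform grid
out = kernel @ V * dx                      # ~ integral of k(x, x') v(x') dx'
# `kernel` is built from the data u itself, which is how attention extracts
# an instance-specific, nonlocal map rather than a fixed convolution stencil.
```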
-
Learning is traditionally studied in biological or computational systems. The power of learning frameworks in solving hard inverse problems provides an appealing case for the development of physical learning, in which physical systems adopt desirable properties on their own, without computational design. It was recently realized that large classes of physical systems can learn through local learning rules, autonomously adapting their parameters in response to observed examples of use. We review recent work in the emerging field of physical learning, describing theoretical and experimental advances in areas ranging from molecular self-assembly to flow networks and mechanical materials. Physical learning machines provide practical advantages over computer-designed ones, in particular by not requiring an accurate model of the system and by their ability to adapt autonomously to changing needs over time. As theoretical constructs, physical learning machines afford a novel perspective on how physical constraints modify abstract learning theory.
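As a rough illustration of a local learning rule, the sketch below adapts edge conductances in a resistor network in the spirit of coupled learning: each edge updates using only its own voltage drops in a "free" state (inputs applied) and an output-clamped state. The network layout, nudge factor, and learning rate are illustrative assumptions, not a specific experiment from the review.

```python
# A hedged sketch of a local learning rule for a resistor network.
import numpy as np

rng = np.random.default_rng(0)
n = 6
edges = [(0, 2), (1, 3), (2, 3), (2, 4), (3, 4), (2, 5), (3, 5)]
k = np.ones(len(edges))                  # edge conductances (the learned DOF)
src, out = [0, 1], [4, 5]                # input and output nodes
v_in, v_target = np.array([1.0, 0.0]), np.array([0.7, 0.3])

def solve(fixed, vals):
    # solve Kirchhoff's laws with Dirichlet boundary values on `fixed`
    L = np.zeros((n, n))
    for (i, j), kij in zip(edges, k):
        L[i, i] += kij; L[j, j] += kij
        L[i, j] -= kij; L[j, i] -= kij
    free = [i for i in range(n) if i not in fixed]
    V = np.zeros(n); V[fixed] = vals
    V[free] = np.linalg.solve(L[np.ix_(free, free)], -L[np.ix_(free, fixed)] @ vals)
    return V

alpha, eta = 0.2, 0.5
for _ in range(200):
    vF = solve(src, v_in)                                  # free state
    nudge = vF[out] + eta * (v_target - vF[out])           # nudge outputs
    vC = solve(src + out, np.concatenate([v_in, nudge]))   # clamped state
    for e, (i, j) in enumerate(edges):
        # local rule: each edge sees only its own two voltage drops
        k[e] += alpha / eta * ((vF[i] - vF[j]) ** 2 - (vC[i] - vC[j]) ** 2) / 2
        k[e] = max(k[e], 1e-3)                             # keep conductances positive

print(solve(src, v_in)[out])  # outputs drift toward v_target as k adapts
```

Note that no global gradient or system model is ever computed; the update is driven entirely by physically measurable, local quantities.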
-
Plug-and-Play Priors (PnP) and Regularization by Denoising (RED) are widely used frameworks for solving imaging inverse problems by computing fixed points of operators combining physical measurement models and learned image priors. While traditional PnP/RED formulations have focused on priors specified using image denoisers, there is a growing interest in learning PnP/RED priors that are end-to-end optimal. The recent Deep Equilibrium Models (DEQ) framework has enabled memory-efficient end-to-end learning of PnP/RED priors by implicitly differentiating through the fixed-point equations without storing intermediate activation values. However, the dependence of the computational/memory complexity of the measurement models in PnP/RED on the total number of measurements leaves DEQ impractical for many imaging applications. We propose ODER as a new strategy for improving the efficiency of DEQ through stochastic approximations of the measurement models. We theoretically analyze ODER, giving insights into its ability to approximate the traditional DEQ approach for solving inverse problems. Our numerical results suggest potential improvements in training/testing complexity due to ODER on three distinct imaging applications.
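Below is a minimal sketch of the fixed-point iteration that PnP/DEQ methods compute, with an ODER-style stochastic approximation of the measurement model: a gradient step on a random subset of measurements followed by a denoising step. The soft-threshold denoiser is a placeholder for a learned prior, and the problem sizes and step sizes are illustrative assumptions.

```python
# A hedged sketch of a PnP proximal-gradient iteration with a subsampled
# measurement model (simplified ODER-style stochastic approximation).
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 48
A = rng.normal(size=(m, n)) / np.sqrt(m)          # measurement model
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = 1.0     # sparse ground truth
y = A @ x_true

def denoise(v, tau=0.05):
    # placeholder prior: soft-thresholding; PnP/RED would use a learned denoiser
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

x, gamma = np.zeros(n), 0.1
for _ in range(300):
    rows = rng.choice(m, m // 2, replace=False)      # subsample measurements
    Ai, yi = A[rows], y[rows]
    grad = Ai.T @ (Ai @ x - yi) * (m / len(rows))    # unbiased gradient estimate
    x = denoise(x - gamma * grad)                    # PnP fixed-point update

print(np.linalg.norm(x - x_true))                    # reconstruction error
```

The point of the subsampling is that each iteration touches only a fraction of the measurement operator, which is where the training/testing complexity savings come from.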
-
Deep learning (DL), in particular deep neural networks (DNNs), is by default purely data-driven and in general does not require physics. This is the strength of DL but also one of its key limitations when applied to science and engineering problems in which underlying physical properties (such as stability, conservation, and positivity) and accuracy are required. DL methods in their original forms are not capable of respecting the underlying mathematical models or achieving desired accuracy even in big-data regimes. On the other hand, many data-driven science and engineering problems, such as inverse problems, typically have limited experimental or observational data, and DL would overfit the data in this case. Leveraging information encoded in the underlying mathematical models, we argue, not only compensates for missing information in low-data regimes but also provides opportunities to equip DL methods with the underlying physics, hence promoting better generalization. This paper develops a model-constrained deep learning approach and its variant TNet, a Tikhonov neural network, that are capable of learning not only information hidden in the training data but also in the underlying mathematical models to solve inverse problems governed by partial differential equations in low-data regimes. We provide the constructions and some theoretical results for the proposed approaches for both linear and nonlinear inverse problems. Since TNet is designed to learn the inverse solution with Tikhonov regularization, it is interpretable: it recovers Tikhonov solutions for linear cases and can approximate Tikhonov solutions to any desired accuracy for nonlinear inverse problems. We also prove that data randomization can enhance not only the smoothness of the networks but also their generalization. Comprehensive numerical results confirm the theoretical findings and show that with even as little as 1 training data sample for 1D deconvolution, 5 for the inverse 2D heat conductivity problem, 100 for inverse initial conditions for the time-dependent 2D Burgers' equation, and 50 for inverse initial conditions for the 2D Navier-Stokes equations, TNet solutions can be as accurate as Tikhonov solutions while being several orders of magnitude faster. This is possible owing to the model-constrained term, replications, and randomization.
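For the linear case that TNet is designed to recover, the Tikhonov solution has a closed form; the sketch below computes it along with the model-constrained objective that a TNet-style network would be penalized with during training. Problem sizes, the forward model, and the regularization weight alpha are illustrative assumptions.

```python
# A minimal sketch of the Tikhonov baseline for a linear inverse problem:
# minimize ||A x - y||^2 + alpha ||x||^2 over x.
import numpy as np

rng = np.random.default_rng(0)
n, m, alpha = 32, 20, 1e-2
A = rng.normal(size=(m, n)) / np.sqrt(m)         # forward model (e.g., blur)
x_true = np.sin(np.linspace(0, 3 * np.pi, n))
y = A @ x_true + 0.01 * rng.normal(size=m)       # noisy observations

# Tikhonov solution: x = (A^T A + alpha I)^{-1} A^T y
x_tik = np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

def tikhonov_loss(x_hat):
    # the model-constrained objective a TNet-style network trains against
    return np.sum((A @ x_hat - y) ** 2) + alpha * np.sum(x_hat ** 2)

print(tikhonov_loss(x_tik))   # the closed-form solution minimizes this loss
```

Embedding the forward model A in the loss is what supplies the missing information in low-data regimes: even a single training pair constrains the network through the physics rather than through data volume.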