Title: Learning Algorithms for Coarsening Uncertainty Space and Applications to Multiscale Simulations
In this paper, we investigate and design multiscale simulations for stochastic multiscale PDEs. In physical space, we consider a coarse grid and a known multiscale method, the generalized multiscale finite element method (GMsFEM). To obtain a low-dimensional representation of the solution in each coarse block, the uncertainty space needs to be partitioned (coarsened). This coarsening groups realizations that exhibit similar multiscale features, as identified by GMsFEM (or another method of choice). This step is known to be computationally demanding, as it requires many local solves and clustering based on them. In this work, we take a different approach and learn the coarsening of the uncertainty space. Our methods use deep learning techniques to identify clusters (the coarsening) in the uncertainty space. We use convolutional neural networks combined with techniques from adversarial neural networks. We define appropriate loss functions in the proposed neural networks, where the loss function is composed of several parts, including terms related to the clusters and to the reconstruction of basis functions. We present numerical results for channelized permeability fields in examples of flow in porous media.
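To make the loss structure described above concrete, the following is a minimal sketch, not the paper's implementation: a small convolutional network that assigns a permeability realization to a cluster and reconstructs a local basis function, trained with a composite loss. The architecture, field size, number of clusters, and the 0.5 weight are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class CoarseningNet(nn.Module):
    """Maps a permeability realization to (i) cluster logits over the
    coarsened uncertainty space and (ii) a reconstructed local basis
    function. All sizes and depths here are illustrative assumptions."""
    def __init__(self, n_clusters=8, field_size=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat = 32 * (field_size // 4) ** 2
        self.cluster_head = nn.Linear(feat, n_clusters)
        # Hypothetical decoder head: reconstructs a coarse-block basis function.
        self.basis_head = nn.Linear(feat, field_size * field_size)

    def forward(self, kappa):
        z = self.encoder(kappa)
        return self.cluster_head(z), self.basis_head(z)

def composite_loss(cluster_logits, basis_pred, cluster_labels, basis_true):
    # Loss with "several parts": a clustering term plus a
    # basis-reconstruction term. The 0.5 weight is an assumed hyperparameter.
    ce = nn.functional.cross_entropy(cluster_logits, cluster_labels)
    rec = nn.functional.mse_loss(basis_pred, basis_true)
    return ce + 0.5 * rec

# Example shapes: a batch of 4 single-channel 32x32 permeability realizations.
x = torch.randn(4, 1, 32, 32)
logits, basis = CoarseningNet()(x)
print(logits.shape, basis.shape)  # torch.Size([4, 8]) torch.Size([4, 1024])
```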
Award ID(s):
1934904
PAR ID:
10177086
Author(s) / Creator(s):
Date Published:
Journal Name:
Mathematics
Volume:
8
Issue:
5
ISSN:
2227-7390
Page Range / eLocation ID:
720
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We studied the use of deep neural networks (DNNs) in the numerical solution of the oscillatory Fredholm integral equation of the second kind. It is known that the solution of the equation exhibits certain oscillatory behavior due to the oscillation of the kernel. It was pointed out recently that standard DNNs favor low-frequency functions and, as a result, often produce poor approximations of functions containing high-frequency components. We addressed this issue in this study. We first developed a numerical method for solving the equation with a DNN as the approximate solution, designing a numerical quadrature tailored to computing oscillatory integrals involving DNNs. We proved that the error of the DNN approximate solution of the equation is bounded by the training loss and the quadrature error. We then proposed a multigrade deep learning (MGDL) model to overcome the spectral bias issue of neural networks. Numerical experiments demonstrate that the MGDL model is effective in extracting multiscale information of the oscillatory solution and in overcoming the spectral bias issue from which a standard DNN model suffers.
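As a rough illustration of the spectral-bias setting and the grade-by-grade refinement idea above, the sketch below fits an oscillatory target by training successive shallow "grades" on the running residual. This is a simplified reading of MGDL (the paper composes grades rather than summing independent fits), and all widths, step counts, and frequencies are assumed.

```python
import torch
import torch.nn as nn

def train_grade(x, residual, width=64, steps=2000, lr=1e-3):
    """Fit one shallow 'grade' to the current residual (a simplification
    of the MGDL construction)."""
    net = nn.Sequential(nn.Linear(1, width), nn.Tanh(), nn.Linear(width, 1))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), residual)
        loss.backward()
        opt.step()
    return net

# Oscillatory target with low- and high-frequency parts, echoing the
# spectral-bias setting described above.
torch.manual_seed(0)
x = torch.linspace(-1, 1, 512).unsqueeze(1)
y = torch.sin(2 * torch.pi * x) + 0.3 * torch.sin(40 * torch.pi * x)

approx = torch.zeros_like(y)
for grade in range(3):  # successive grades refine the remaining residual
    net = train_grade(x, y - approx)
    with torch.no_grad():
        approx = approx + net(x)
    err = (torch.norm(y - approx) / torch.norm(y)).item()
    print(f"grade {grade}: relative L2 error {err:.3e}")
```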
  2. Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by viewing parameter-space regularization as specifying an empirical prior distribution over the model parameters, we can derive a probabilistically well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training. This method, which we refer to as function-space empirical Bayes (FS-EB), includes both parameter- and function-space regularization, is mathematically simple, easy to implement, and incurs only minimal computational overhead compared to standard regularization techniques. We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection, highly calibrated predictive uncertainty estimates, successful task adaptation from pre-trained models, and improved generalization under covariate shift.
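The following is a schematic of combined parameter- and function-space regularization in the spirit of the abstract, not the paper's FS-EB objective: a task loss plus a penalty pulling weights toward a frozen prior model and a penalty keeping predictions near that model's outputs on unlabeled context points. The weights lam_param and lam_func, the architectures, and all shapes are assumed for illustration.

```python
import torch
import torch.nn as nn

def fs_regularized_loss(model, prior_model, x, y, x_context,
                        lam_param=1e-4, lam_func=1e-2):
    """Schematic parameter- plus function-space regularized loss; the
    actual FS-EB objective is derived probabilistically and differs."""
    nll = nn.functional.cross_entropy(model(x), y)
    # Parameter-space term: an empirical prior centered at the (frozen)
    # prior model's weights.
    p_term = sum(((p - q.detach()) ** 2).sum()
                 for p, q in zip(model.parameters(), prior_model.parameters()))
    # Function-space term: keep predictions close to the prior model's
    # predictions on unlabeled context inputs.
    f_term = nn.functional.mse_loss(model(x_context),
                                    prior_model(x_context).detach())
    return nll + lam_param * p_term + lam_func * f_term

# Toy usage with matching architectures and made-up data.
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 3))
prior_model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 3))
x, y = torch.randn(16, 8), torch.randint(0, 3, (16,))
x_context = torch.randn(32, 8)
print(fs_regularized_loss(model, prior_model, x, y, x_context))
```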
  3. Recent efforts in deploying Deep Neural Networks for object detection in real-world applications, such as autonomous driving, assume that all relevant object classes have been observed during training. Quantifying the performance of these models when the test data is not represented in the training set has mostly focused on pixel-level uncertainty estimation techniques for models trained for semantic segmentation. This paper proposes to exploit additional predictions of semantic segmentation models, quantify their confidence, and then classify object hypotheses as known vs. unknown (out-of-distribution) objects. We use object proposals generated by a Region Proposal Network (RPN) and adapt distance-aware uncertainty estimation for semantic segmentation using Radial Basis Function Networks (RBFN) for class-agnostic object mask prediction. The augmented object proposals are then used to train a classifier for known vs. unknown object categories. Experimental results demonstrate that the proposed method achieves performance comparable to state-of-the-art methods for unknown object detection and can also be used effectively to reduce object detectors' false positive rates. Our method is well suited for applications where the prediction of non-object background categories obtained by semantic segmentation is reliable.
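A minimal NumPy sketch of the distance-aware confidence idea: embeddings far from every known-class center receive a low RBF response and can be flagged as unknown. The feature dimension, number of classes, bandwidth sigma, and data below are all made up for illustration; this is not the paper's RBFN.

```python
import numpy as np

def rbf_confidence(features, centers, sigma=1.0):
    """Distance-aware confidence: RBF kernel response to the nearest
    known-class center. Names and values here are illustrative."""
    # Pairwise squared distances between embeddings and class centers.
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    conf = np.exp(-d2 / (2.0 * sigma ** 2))  # per-class kernel response
    return conf.max(axis=1)                  # confidence of nearest class

rng = np.random.default_rng(0)
centers = rng.normal(size=(5, 16))                   # 5 known object classes
known = centers[0] + 0.1 * rng.normal(size=(3, 16))  # near a known center
unknown = 5.0 * rng.normal(size=(3, 16))             # far from every center
print(rbf_confidence(known, centers))    # near 1: treat as known
print(rbf_confidence(unknown, centers))  # near 0: flag as unknown
```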
  4. Summary: This paper presents a method for determining the relevant buses for reduced models of power grid networks described by systems of differential-algebraic equations and for constructing the coarse-grain dynamical power grid systems. To determine these buses, time integration of differential equations is not needed; rather, a stationary system is analyzed. However, unlike stationary-system approaches that determine only coarse generator buses by approximating the coherency of the generators, the proposed method analyzes the graph Laplacian associated with the admittance matrix. The buses for the reduced model are chosen to ensure that the graph Laplacian of the reduced model is an accurate approximation to the graph Laplacian of the full system. Both load and generator buses can be selected by this procedure since the Laplacian is defined on all the buses. The basis of this proposed approach lies in the close relationship between the synchrony of the system and the spectral properties of this Laplacian, that is, conditions on the spectrum of this Laplacian that almost surely guarantee the synchrony of the system. Thus, assuming that the full system is in synchrony, our strategy is to coarsen the full-system Laplacian such that the coarse Laplacian closely approximates these spectral conditions. Accurately approximating these conditions can then better lead to synchronous reduced models. The coarsened Laplacian is defined on coarse degrees of freedom (DOFs), which are associated with the relevant buses to include in the reduced model. To realize this coarse DOF selection, we use multigrid coarsening techniques based on compatible relaxation. Multigrid is the natural choice since it has been extensively used to coarsen Laplacians arising from discretizations of elliptic partial differential equations and is actively being extended to graph Laplacians. With the selection of the buses for the reduced model, the reduced model is completed by constructing the coarse admittance matrix values and other physical parameters using standard power grid techniques or by using the intergrid operators constructed in the coarse DOF selection process. Unfortunately, the selection of the coarse buses and the coarsening of the admittance matrix and physical parameters are not sufficient by themselves to produce a stable reduced system. To achieve a stable system, system structures of the fine-grain model must be preserved in the reduced model. We analyze this to develop a multigrid methodology for constructing stable reduced models of power grid systems. Numerical examples are presented to validate this methodology.
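To illustrate the central object here, the sketch below builds a graph Laplacian from made-up admittance magnitudes on a 6-bus toy network, forms a Galerkin coarse Laplacian from a crude two-aggregate partition, and compares spectral gaps. The paper selects coarse buses via compatible relaxation within a multigrid framework, which this toy aggregation does not reproduce.

```python
import numpy as np

# Toy 6-bus network: symmetric admittance magnitudes |Y_ij| as edge
# weights. The graph Laplacian L = D - W is the object whose spectrum
# the coarsening tries to preserve. Values are invented for illustration.
W = np.array([[0, 2, 1, 0, 0, 0],
              [2, 0, 2, 1, 0, 0],
              [1, 2, 0, 0, 1, 0],
              [0, 1, 0, 0, 2, 1],
              [0, 0, 1, 2, 0, 2],
              [0, 0, 0, 1, 2, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W

# Crude two-aggregate coarsening: group buses {0,1,2} and {3,4,5} with a
# piecewise-constant interpolation P, then form the Galerkin coarse
# Laplacian P^T L P.
P = np.zeros((6, 2))
P[:3, 0] = 1.0
P[3:, 1] = 1.0
Lc = P.T @ L @ P

# Compare the spectral gap (smallest nonzero eigenvalue), the spectral
# quantity tied to synchrony in the abstract above.
print("fine gap:  ", np.sort(np.linalg.eigvalsh(L))[1])
print("coarse gap:", np.sort(np.linalg.eigvalsh(Lc))[1])
```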
  5. A wide variety of activation functions have been proposed for neural networks. The Rectified Linear Unit (ReLU) is especially popular today. There are many practical reasons that motivate the use of the ReLU. This paper provides new theoretical characterizations that support the use of the ReLU, its variants such as the leaky ReLU, as well as other activation functions in the case of univariate, single-hidden-layer feedforward neural networks. Our results also explain the importance of commonly used strategies in the design and training of neural networks, such as "weight decay" and "path-norm" regularization, and provide a new justification for the use of "skip connections" in network architectures. These new insights are obtained through the lens of spline theory. In particular, we show how neural network training problems are related to infinite-dimensional optimizations posed over Banach spaces of functions whose solutions are well known to be fractional and polynomial splines, where the particular Banach space (which controls the order of the spline) depends on the choice of activation function.
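As a small illustration of the spline view, assuming nothing beyond the abstract: a univariate single-hidden-layer ReLU network (plus a skip term) is a continuous piecewise-linear function with knots at -b_k/w_k, and the path norm sum_k |v_k||w_k| controls the variation of its slope. All weights below are random.

```python
import numpy as np

# A univariate, single-hidden-layer ReLU network
#   f(x) = sum_k v_k * relu(w_k * x + b_k) + c*x + d
# is a linear spline with knots at x_k = -b_k / w_k. The affine skip
# term c*x + d mirrors the role of skip connections mentioned above.
rng = np.random.default_rng(1)
K = 5
w, b, v = rng.normal(size=K), rng.normal(size=K), rng.normal(size=K)

def f(x):
    return np.maximum(w * x[:, None] + b, 0.0) @ v + 0.5 * x + 0.1

print("knots:", np.sort(-b / w))

# Between knots, f is affine, so second differences of uniform samples
# vanish (up to rounding) except on the few intervals containing a knot.
x = np.linspace(-5, 5, 10001)
d2 = np.abs(np.diff(f(x), 2))
print("vanishing second differences:", (d2 < 1e-8).sum(), "of", d2.size)

# The "path norm" sum_k |v_k||w_k| bounds the total variation of f',
# i.e. the spline's regularity, which is why path-norm regularization
# appears in the analysis.
print("path norm:", np.abs(v * w).sum())
```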