Title: Least-Squares Neural Network (LSNN) Method for Linear Advection-Reaction Equation: Discontinuity Interface
We studied the least-squares ReLU neural network (LSNN) method for solving a linear advection-reaction equation with a discontinuous solution in [Z. Cai et al., J. Comput. Phys., 443 (2021), 110514]. The method is based on a least-squares formulation and uses a new class of approximating functions: ReLU neural network (NN) functions. A critical component of the LSNN method that distinguishes it from other NN-based methods is the introduction of a properly designed, physics-preserving discrete differential operator. In this paper, we study the LSNN method for problems with discontinuity interfaces. First, we show that ReLU NN functions with depth \(\lceil \log_2(d+1)\rceil+1\) can approximate, to any prescribed accuracy, any \(d\)-dimensional step function whose discontinuity interface is generated by the streamlines of a vector field. By decomposing the solution into continuous and discontinuous parts, we prove that the discretization error of the LSNN method using ReLU NN functions with depth \(\lceil \log_2(d+1)\rceil+1\) is determined mainly by the continuous part of the solution, provided that the solution jump is constant. Numerical results for two- and three-dimensional test problems with various discontinuity interfaces show that the LSNN method with enough layers is accurate and does not exhibit the Gibbs phenomenon along discontinuity interfaces.
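To make the formulation concrete, the following is a minimal sketch (not the authors' implementation) of a least-squares ReLU network loss for the advection-reaction equation v·∇u + cu = f with inflow data u = g on Γ₋. The depth 3 corresponds to ⌈log₂(d+1)⌉+1 for d = 2; the use of PyTorch autograd for the advective derivative is an illustrative simplification, whereas the paper employs a physics-preserving discrete differential operator.

```python
# Minimal sketch (not the authors' code): least-squares ReLU NN loss for the
# advection-reaction equation  v.grad(u) + c*u = f  with inflow data u = g on Gamma_-.
# The paper replaces the advective derivative with a physics-preserving discrete
# operator; autograd is used here purely for illustration.
import torch
import torch.nn as nn

d = 2        # spatial dimension
depth = 3    # ceil(log2(d+1)) + 1 = 3 hidden layers for d = 2

def make_relu_net(width=32):
    layers, in_dim = [], d
    for _ in range(depth):
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim = width
    layers.append(nn.Linear(in_dim, 1))
    return nn.Sequential(*layers)

def ls_loss(net, x_int, x_inflow, v, c, f, g):
    """Sampled least-squares functional: interior residual plus inflow boundary term."""
    x_int = x_int.clone().requires_grad_(True)
    u = net(x_int)
    grad_u = torch.autograd.grad(u.sum(), x_int, create_graph=True)[0]
    residual = (grad_u * v).sum(dim=1, keepdim=True) + c * u - f(x_int)
    boundary = net(x_inflow) - g(x_inflow)
    return residual.pow(2).mean() + boundary.pow(2).mean()
```

A training loop would then minimize ls_loss over sampled interior and inflow points with a standard optimizer; names such as x_inflow, f, and g are placeholders for problem data.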
Award ID(s):
2110571
PAR ID:
10612363
Author(s) / Creator(s):
; ;
Publisher / Repository:
Society for Industrial and Applied Mathematics (SIAM)
Date Published:
Journal Name:
SIAM Journal on Scientific Computing
Volume:
46
Issue:
4
ISSN:
1064-8275
Page Range / eLocation ID:
C448 to C478
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we propose a new deep learning method for nonlinear Poisson-Boltzmann problems with applications in computational biology. To tackle the discontinuity of the solution, e.g., across protein surfaces, we approximate the solution by a piecewise, mesh-free neural network that can capture the dramatic change in the solution across the interface. The partial differential equation problem is first reformulated as a least-squares, physics-informed neural network (PINN)-type problem and then discretized into an objective function using the mean squared error over sampled points. The solution is obtained by minimizing this objective function via standard training algorithms such as stochastic gradient descent. Finally, the effectiveness and efficiency of the neural network are validated on various manufactured functions with different frequencies over complex protein interfaces.
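As a rough illustration of the piecewise mesh-free network idea (a sketch under assumptions, not the authors' architecture), one subnetwork can be assigned to each side of the protein interface and selected by a membership mask; the Poisson-Boltzmann residual, interface conditions, and sampling strategy from the paper are omitted.

```python
# Illustrative sketch only: one subnetwork per side of the interface, selected by a
# membership mask. The PDE residual, jump conditions, and sampling from the paper
# are not reproduced here.
import torch
import torch.nn as nn

class PiecewiseNet(nn.Module):
    def __init__(self, dim=3, width=64):
        super().__init__()
        def mlp():
            return nn.Sequential(nn.Linear(dim, width), nn.Tanh(),
                                 nn.Linear(width, width), nn.Tanh(),
                                 nn.Linear(width, 1))
        self.u_in, self.u_out = mlp(), mlp()   # subnetworks for the two subdomains

    def forward(self, x, inside_mask):
        # inside_mask: boolean tensor of shape (N, 1), True where x lies inside the interface.
        return torch.where(inside_mask, self.u_in(x), self.u_out(x))
```

Training would then minimize a sampled mean-squared PINN-type objective with a stochastic gradient method, as described in the abstract.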
  2. Existing parallel algorithms for wavelet tree construction have a work complexity of $$O(n\log\sigma)$$. This paper presents parallel algorithms for the problem with improved work complexity. Our first algorithm is based on parallel integer sorting and has either $$O(n\log\log n\lceil\log\sigma/\sqrt{\log n\log\log n}\rceil)$$ work and polylogarithmic depth, or $$O(n\lceil\log\sigma/\sqrt{\log n}\rceil)$$ work and sub-linear depth. We also describe another algorithm that has $$O(n\lceil\log\sigma/\sqrt{\log n} \rceil)$$ work and $$O(\sigma+\log n)$$ depth. We then show how to use similar ideas to construct variants of wavelet trees (arbitrary-shaped binary trees and multiary trees) as well as wavelet matrices in parallel with lower work complexity than prior algorithms. Finally, we show that the rank and select structures on binary sequences and multiary sequences, which are stored on wavelet tree nodes, can be constructed in parallel with improved work bounds, matching those of the best existing sequential algorithms for constructing rank and select structures. 
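For background (not the paper's parallel algorithms), a wavelet tree over an alphabet [0, σ) can be built sequentially by splitting the alphabet in half at each node and storing one bitvector per node; this is the O(n log σ)-work baseline the paper improves on. The sketch below is a simple illustration of that data structure.

```python
# Background sketch: sequential wavelet tree construction over alphabet [lo, hi).
# This is the O(n log sigma)-work baseline; the paper's parallel algorithms with
# lower work are not shown here.
def wavelet_tree(seq, lo, hi):
    """Return a nested-dict wavelet tree of seq (integers in [lo, hi))."""
    if hi - lo <= 1 or not seq:
        return None
    mid = (lo + hi) // 2
    bits = [1 if c >= mid else 0 for c in seq]   # bitvector stored at this node
    left = [c for c in seq if c < mid]
    right = [c for c in seq if c >= mid]
    return {"bits": bits,
            "left": wavelet_tree(left, lo, mid),
            "right": wavelet_tree(right, mid, hi)}

tree = wavelet_tree([3, 0, 1, 3, 2, 1], 0, 4)    # example over the alphabet {0, 1, 2, 3}
```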
  3. Abstract A conjecture of Kalai asserts that for $$d\geq 4$$, the affine type of a prime simplicial $$d$$-polytope $$P$$ can be reconstructed from the space of affine $$2$$-stresses of $$P$$. We prove this conjecture for all $$d\geq 5$$. We also prove the following generalization: for all pairs $(i,d)$ with $$2\leq i\leq \lceil \frac d 2\rceil -1$$, the affine type of a simplicial $$d$$-polytope $$P$$ that has no missing faces of dimension $$\geq d-i+1$$ can be reconstructed from the space of affine $$i$$-stresses of $$P$$. A consequence of our proofs is a strengthening of the Generalized Lower Bound Theorem: it was proved by Nagel that for any simplicial $(d-1)$-sphere $$\Delta $$ and $$1\leq k\leq \lceil \frac {d}{2}\rceil -1$$, $$g_{k}(\Delta )$$ is at least as large as the number of missing $(d-k)$-faces of $$\Delta $$; here we show that, for $$1\leq k\leq \lfloor \frac {d}{2}\rfloor -1$$, equality holds if and only if $$\Delta $$ is $$k$$-stacked. Finally, we show that for $$d\geq 4$$, any simplicial $$d$$-polytope $$P$$ that has no missing faces of dimension $$\geq d-1$$ is redundantly rigid, that is, for each edge $$e$$ of $$P$$, there exists an affine $$2$$-stress on $$P$$ with a non-zero value on $$e$$. 
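For readability, the Generalized Lower Bound relation described above can be restated in display form (notation as in the abstract; writing $$m_{d-k}(\Delta)$$ for the number of missing $$(d-k)$$-faces is a label introduced here):

$$g_{k}(\Delta) \;\ge\; m_{d-k}(\Delta) \qquad \text{for } 1\leq k\leq \lceil \tfrac{d}{2}\rceil -1,$$

with equality for $$1\leq k\leq \lfloor \tfrac{d}{2}\rfloor -1$$ if and only if $$\Delta$$ is $$k$$-stacked.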
  4. Ruslan Salakhutdinov (Ed.)
    This paper develops simple feed-forward neural networks that achieve the universal approximation property for all continuous functions with a fixed finite number of neurons. These neural networks are simple because they are designed with a simple and computable continuous activation function $$\sigma$$ leveraging a triangular-wave function and a softsign function. We prove that $$\sigma$$-activated networks with width $36d(2d+1)$ and depth $11$ can approximate any continuous function on a $$d$$-dimensional hypercube within an arbitrarily small error. Hence, for supervised learning and its related regression problems, the hypothesis space generated by these networks with a size not smaller than $$36d(2d+1)\times 11$$ is dense in the space of continuous functions. Furthermore, classification functions arising from image and signal classification are in the hypothesis space generated by $$\sigma$$-activated networks with width $36d(2d+1)$ and depth $12$, when there exist pairwise disjoint closed bounded subsets of $$\mathbb{R}^d$$ such that the samples of the same class are located in the same subset. 
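The exact activation $$\sigma$$ is constructed in the paper; the snippet below is only a hypothetical illustration of how a triangular-wave piece and a softsign piece might be combined, and the specific combination is an assumption rather than the paper's definition.

```python
# Hypothetical illustration: combining a triangular wave with a softsign function.
# The precise activation used in the paper is different; this combination is a guess
# made only to show the two ingredients named in the abstract.
import numpy as np

def triangle_wave(x, period=2.0):
    t = np.mod(x, period) / period
    return 2.0 * np.minimum(t, 1.0 - t)        # triangular wave with values in [0, 1]

def softsign(x):
    return x / (1.0 + np.abs(x))

def sigma(x):
    # Assumed piecewise combination: softsign for negative inputs, triangular wave otherwise.
    return np.where(x < 0.0, softsign(x), triangle_wave(x))
```

Per the abstract, $$\sigma$$-activated networks of width $36d(2d+1)$ and depth $11$ then achieve the stated approximation property.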
  5. We study the theory of neural networks (NNs) through the lens of classical nonparametric regression problems, with a focus on NNs' ability to adaptively estimate functions with heterogeneous smoothness -- a property of functions in Besov or Bounded Variation (BV) classes. Existing work on this problem requires tuning the NN architecture based on the function space and the sample size. We consider a "Parallel NN" variant of deep ReLU networks and show that standard weight decay is equivalent to promoting ℓp-sparsity (0<p<1) of the coefficient vector over a set of end-to-end learned basis functions, i.e., a dictionary. Using this equivalence, we further establish that by tuning only the weight decay, such a Parallel NN achieves an estimation error arbitrarily close to the minimax rates for both the Besov and BV classes. Notably, it gets exponentially closer to minimax optimal as the NN gets deeper. Our research sheds new light on why depth matters and how NNs are more powerful than kernel methods.
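As a sketch under stated assumptions (branch counts, widths, depths, and hyperparameters below are placeholders), a "Parallel NN" can be formed as a sum of many narrow deep ReLU branches and trained with standard weight decay as the only tuned regularizer; the equivalence between weight decay and ℓp-sparse dictionary regression is the paper's theoretical result and is not demonstrated by this code.

```python
# Sketch of a "Parallel NN": a sum of many narrow deep ReLU subnetworks trained with
# standard weight decay. Sizes and hyperparameters are placeholders, not the paper's.
import torch
import torch.nn as nn

class ParallelNN(nn.Module):
    def __init__(self, num_branches=64, depth=4, width=4, dim=1):
        super().__init__()
        def branch():
            layers, in_dim = [], dim
            for _ in range(depth):
                layers += [nn.Linear(in_dim, width), nn.ReLU()]
                in_dim = width
            layers.append(nn.Linear(in_dim, 1))
            return nn.Sequential(*layers)
        self.branches = nn.ModuleList(branch() for _ in range(num_branches))

    def forward(self, x):
        # The network output is the sum of all parallel branches.
        return sum(b(x) for b in self.branches)

model = ParallelNN()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-4)  # weight decay is the only tuned knob
```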