Title: Neural network approximation
Neural networks (NNs) are the method of choice for building learning algorithms. They are now being investigated for other numerical tasks such as solving high-dimensional partial differential equations. Their popularity stems from their empirical success on several challenging learning problems (computer chess/Go, autonomous navigation, face recognition). However, most scholars agree that a convincing theoretical explanation for this success is still lacking. Since these applications revolve around approximating an unknown function from data observations, part of the answer must involve the ability of NNs to produce accurate approximations. This article surveys the known approximation properties of the outputs of NNs with the aim of uncovering the properties that are not present in the more traditional methods of approximation used in numerical analysis, such as approximations using polynomials, wavelets, rational functions and splines. Comparisons are made with traditional approximation methods from the viewpoint of rate distortion, i.e. error versus the number of parameters used to create the approximant. Another major component in the analysis of numerical approximation is the computational time needed to construct the approximation, and this in turn is intimately connected with the stability of the approximation algorithm. Accordingly, the stability of numerical approximation using NNs is a large part of the analysis put forward. The survey, for the most part, is concerned with NNs using the popular ReLU activation function. In this case, the outputs of the NNs are piecewise linear functions on rather complicated partitions of the domain of the target function f into cells that are convex polytopes. When the architecture of the NN is fixed and the parameters are allowed to vary, the set of output functions of the NN is a parametrized nonlinear manifold. It is shown that this manifold has certain space-filling properties that lead to an increased ability to approximate (better rate distortion), but at the expense of numerical stability. This space filling makes it challenging for a numerical method to find best, or even good, parameter choices when constructing an approximation.
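To make the piecewise-linear picture concrete, here is a minimal NumPy sketch (our illustration, not from the survey; the weights are arbitrary) of a one-hidden-layer ReLU network in one input dimension. Each hidden unit switches on or off at x = -b_i / w_i, so the output is continuous and piecewise linear, with at most one breakpoint per unit.

    import numpy as np

    rng = np.random.default_rng(0)
    n_hidden = 8
    W1 = rng.standard_normal(n_hidden)   # input-to-hidden weights
    b1 = rng.standard_normal(n_hidden)   # hidden biases
    W2 = rng.standard_normal(n_hidden)   # hidden-to-output weights

    def relu_net(x):
        # network output at scalar input(s) x
        return W2 @ np.maximum(np.outer(W1, x) + b1[:, None], 0.0)

    # Breakpoints: where a hidden unit's pre-activation crosses zero.
    kinks = np.sort(-b1 / W1)
    inside = kinks[(kinks > -3) & (kinks < 3)]
    print(f"{inside.size} breakpoints -> at most {inside.size + 1} linear pieces on [-3, 3]")

    # Check: between consecutive breakpoints the slope is constant.
    for left, right in zip(inside[:-1], inside[1:]):
        xs = np.linspace(left + 1e-6, right - 1e-6, 5)
        slopes = np.diff(relu_net(xs)) / np.diff(xs)
        assert np.allclose(slopes, slopes[0])

In higher dimensions the same mechanism partitions the domain into convex polytopes, one per activation pattern, which is the geometry the survey analyzes.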
Award ID(s):
2045167
PAR ID:
10340139
Author(s) / Creator(s):
Date Published:
Journal Name:
Acta Numerica
Volume:
30
ISSN:
0962-4929
Page Range / eLocation ID:
327 to 444
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Formal certification of neural networks (NNs) is crucial for ensuring their safety, fairness, and robustness. Unfortunately, sound and complete certification algorithms for ReLU-based NNs do not scale to large networks, while incomplete certification algorithms are easier to compute but produce loose bounds that deteriorate with the depth of the NN, diminishing their effectiveness. In this paper, we ask the following question: can we replace the ReLU activation function with one that opens the door to incomplete certification algorithms that are easy to compute yet produce tight bounds on the NN's outputs? We introduce DeepBern-Nets, a class of NNs with activation functions based on Bernstein polynomials instead of the commonly used ReLU activation. Bernstein polynomials are smooth and differentiable functions with desirable properties such as the so-called range enclosure and subdivision properties. We design a novel Interval Bound Propagation (IBP) algorithm, called Bern-IBP, to efficiently compute tight bounds on the outputs of DeepBern-Nets. Our approach leverages the properties of Bernstein polynomials to improve the tractability of neural network certification while maintaining the accuracy of the trained networks. We conduct experiments in adversarial-robustness and reachability-analysis settings to assess the effectiveness of the approach. Our proposed framework achieves high certified accuracy for adversarially trained NNs, which is often a challenging task for certifiers of ReLU-based NNs. This work establishes Bernstein polynomial activations as a promising alternative for improving NN certification across various applications. (A small sketch of the range-enclosure property appears after this list.)
  2. Recent studies suggest that deep neural networks (DNNs) have the potential to outperform neural networks (NNs) in approximating complex dynamics, which may enhance the tracking performance of a control system. However, unlike NN-based nonlinear control systems, designing update policies for the inner-layer weights of a DNN using Lyapunov-based stability methods is problematic, since the inner-layer weights are nested within activation functions. Traditional DNN training methods (e.g., gradient descent) may improve the approximation capability of a DNN and thus enhance a DNN-based controller's tracking performance; however, traditional DNN-based control approaches lack stability guarantees and may be ineffective at training the DNN without large data sets, which can hinder tracking performance. In this work, a DNN-based control structure is developed for a hybrid exoskeleton that combines a rehabilitative robot with functional electrical stimulation (FES). The proposed control system updates the DNN weights in real time, and a rigorous Lyapunov-based stability analysis is performed to ensure semi-global asymptotic trajectory tracking, even in the absence of a data set. Specifically, a Lyapunov-based update law is developed for the output-layer DNN weights, and Lyapunov-based constraints are established for the adaptation laws of the inner-layer DNN weights. Additionally, the DNN-based FES controller is saturated to increase the comfort and safety of the participant. (A generic sketch of a Lyapunov-motivated output-layer update appears after this list.)
  3. This paper focuses on the output tracking control problem for a wave equation with both matched and unmatched boundary uncertainties. An adaptive boundary feedback control scheme is proposed that utilizes radial basis function neural networks (RBF NNs) to deal with the effect of system uncertainties. Specifically, two RBF NN models are first developed to approximate the matched and unmatched uncertain dynamics, respectively. Based on this, an adaptive NN control scheme is derived that consists of: (i) an adaptive boundary feedback controller, embedded with the NN model approximating the matched uncertainty, for rendering stable and accurate tracking control; and (ii) a reference model, embedded with the NN model approximating the unmatched uncertainty, for generating a prescribed reference trajectory. Rigorous analysis is performed using Lyapunov theory and C0-semigroup theory to prove that the proposed control scheme guarantees closed-loop stability and well-posedness. A simulation study demonstrates the effectiveness of the proposed approach. (A sketch of the RBF NN approximator structure appears after this list.)
  4. We develop a general theory of flows in the space of Riemannian metrics induced by neural network (NN) gradient descent. This is motivated in part by recent advances in approximating Calabi–Yau metrics with NNs and is enabled by recent advances in understanding flows in the space of NNs. We derive the corresponding metric flow equations, which are governed by a metric neural tangent kernel (NTK), a complicated, non-local object that evolves in time. However, many architectures admit an infinite-width limit in which the kernel becomes fixed and the dynamics simplify. Additional assumptions can induce locality in the flow, which allows for the realization of Perelman's formulation of Ricci flow that was used to resolve the 3D Poincaré conjecture. We demonstrate that such fixed-kernel regimes lead to poor learning of numerical Calabi–Yau metrics, as expected, since the associated NNs do not learn features. Conversely, we demonstrate that well-learned numerical metrics at finite width exhibit an evolving metric-NTK, associated with feature learning. Our theory of NN metric flows therefore explains why NNs are better at learning Calabi–Yau metrics than fixed-kernel methods such as Ricci flow. (A toy demonstration of an evolving empirical NTK appears after this list.)
  5. Overparameterized neural networks enjoy great representation power on complex data and, more importantly, yield sufficiently smooth outputs, which is crucial to their generalization and robustness. Most existing function approximation theories suggest that, with sufficiently many parameters, neural networks can approximate certain classes of functions well in terms of function value. The neural networks themselves, however, can be highly nonsmooth. To bridge this gap, we take convolutional residual networks (ConvResNets) as an example and prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness. Moreover, we extend our theory to approximating functions supported on a low-dimensional manifold. Our theory partially justifies the benefits of using deep and wide networks in practice. Numerical experiments on adversarially robust image classification are provided to support our theory. (A toy probe of a residual network's first-order smoothness appears after this list.)
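Sketch for item 1 (DeepBern-Nets): the range-enclosure property the abstract mentions says that a Bernstein polynomial on [0, 1] is bounded by the minimum and maximum of its coefficients, because the Bernstein basis is nonnegative and sums to one. This is what makes cheap interval bounds possible. A minimal NumPy illustration (ours, with hypothetical coefficients; not the paper's Bern-IBP implementation):

    import numpy as np
    from math import comb

    def bernstein_eval(coeffs, x):
        # p(x) = sum_k c_k * C(n, k) * x^k * (1 - x)^(n - k) on [0, 1]
        n = len(coeffs) - 1
        return sum(c * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k, c in enumerate(coeffs))

    coeffs = np.array([0.2, -1.0, 0.7, 0.3, -0.4])   # hypothetical activation coefficients
    xs = np.linspace(0.0, 1.0, 1001)
    vals = np.array([bernstein_eval(coeffs, x) for x in xs])

    lo, hi = coeffs.min(), coeffs.max()   # the enclosure: no optimization needed
    assert lo - 1e-12 <= vals.min() and vals.max() <= hi + 1e-12
    print(f"true range ~[{vals.min():.3f}, {vals.max():.3f}] inside enclosure [{lo:.3f}, {hi:.3f}]")

Propagating such coefficient-derived intervals layer by layer is the basic shape of an IBP-style bound.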
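Sketch for item 2 (Lyapunov-based weight updates): a generic, simplified version of a Lyapunov-motivated output-layer adaptation law, W_hat_dot = gamma * e * phi(x), driven by an error signal rather than a stored data set (our stand-in; the paper's exoskeleton/FES law and its inner-layer constraints are more involved, and the feature vector here is hypothetical):

    import numpy as np

    rng = np.random.default_rng(1)
    n_features, gamma, dt = 5, 5.0, 1e-3
    centers = np.linspace(-1.0, 1.0, n_features)
    W_true = rng.standard_normal(n_features)   # stand-in for the ideal weights
    W_hat = np.zeros(n_features)

    def phi(x):
        return np.tanh(x - centers)            # hypothetical feature vector

    for step in range(50000):
        x = np.sin(2 * np.pi * step * dt)      # persistently exciting input
        e = (W_true - W_hat) @ phi(x)          # error signal driving adaptation
        W_hat += dt * gamma * e * phi(x)       # Euler step of the update law

    xs_test = np.linspace(-1.0, 1.0, 101)
    err = max(abs((W_true - W_hat) @ phi(x)) for x in xs_test)
    print(f"max residual function error after online adaptation: {err:.4f}")

The choice V = ||W_true - W_hat||^2 / (2 * gamma) gives V_dot = -e^2 <= 0 along this update, which is the Lyapunov argument in miniature.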
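Sketch for item 3 (RBF NN approximators): the uncertainty models have the structure f_hat(x) = W_hat^T phi(x) with Gaussian radial basis functions. A minimal illustration (ours; the paper pairs two such models with a boundary controller and Lyapunov-derived weight updates, whereas here the weights are simply fit by least squares to a hypothetical uncertainty):

    import numpy as np

    centers = np.linspace(0.0, 1.0, 15)   # hypothetical RBF centers over the domain
    sigma = 0.1

    def phi(x):
        # Gaussian RBF features, one column per center
        return np.exp(-((np.atleast_1d(x)[:, None] - centers) ** 2) / sigma**2)

    f_unknown = lambda x: np.sin(3 * x) + 0.5 * x**2   # stand-in uncertain dynamics

    xs = np.linspace(0.0, 1.0, 200)
    Phi = phi(xs)                                      # (200, 15) design matrix
    W_hat, *_ = np.linalg.lstsq(Phi, f_unknown(xs), rcond=None)

    err = np.max(np.abs(Phi @ W_hat - f_unknown(xs)))
    print(f"max RBF approximation error on [0, 1]: {err:.2e}")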
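Sketch for item 4 (evolving NTK): the empirical NTK is K(x, x') = <grad_theta f(x), grad_theta f(x')>. At finite width it drifts as training moves the parameters, which is the feature-learning signature the abstract describes; in the infinite-width limit it would stay (nearly) fixed. A toy tanh-network demonstration (ours, unrelated to the paper's Calabi-Yau setting):

    import numpy as np

    rng = np.random.default_rng(0)
    width = 16
    w, b, a = rng.standard_normal((3, width))   # f(x) = a . tanh(w * x + b)

    def jacobian(x):
        h = np.tanh(w * x + b)
        dh = 1.0 - h**2
        return np.concatenate([h, a * dh * x, a * dh])   # df/da, df/dw, df/db

    def ntk(xs):
        J = np.stack([jacobian(x) for x in xs])
        return J @ J.T

    xs = np.linspace(-1.0, 1.0, 5)
    y = np.sin(np.pi * xs)
    K0 = ntk(xs)

    lr = 0.01
    for _ in range(200):   # plain gradient descent on the mean squared loss
        g = np.array([a @ np.tanh(w * x + b) for x in xs]) - y
        h = [np.tanh(w * x + b) for x in xs]
        ga = sum(gi * hi for gi, hi in zip(g, h)) / len(xs)
        gw = sum(gi * a * (1 - hi**2) * x for gi, hi, x in zip(g, h, xs)) / len(xs)
        gb = sum(gi * a * (1 - hi**2) for gi, hi in zip(g, h)) / len(xs)
        a, w, b = a - lr * ga, w - lr * gw, b - lr * gb

    drift = np.linalg.norm(ntk(xs) - K0) / np.linalg.norm(K0)
    print(f"relative NTK drift after training: {drift:.3f}")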
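Sketch for item 5 (first-order smoothness): the abstract's claim is that a network can match a target in value while keeping its first derivative under control. A crude numerical probe of this (ours, using a dense residual toy model rather than a ConvResNet) estimates the gradient norm of the network over random inputs:

    import numpy as np

    rng = np.random.default_rng(0)
    dim, width = 2, 32
    blocks = [(rng.standard_normal((width, dim)) / np.sqrt(dim),
               rng.standard_normal((dim, width)) / np.sqrt(width))
              for _ in range(3)]
    v = rng.standard_normal(dim)

    def resnet(x):
        # residual blocks x <- x + 0.1 * W_out tanh(W_in x); smooth by construction
        for W_in, W_out in blocks:
            x = x + 0.1 * (W_out @ np.tanh(W_in @ x))
        return v @ x

    def grad_norm(x, eps=1e-5):
        # central finite differences along each coordinate
        g = np.array([(resnet(x + eps * e) - resnet(x - eps * e)) / (2 * eps)
                      for e in np.eye(dim)])
        return np.linalg.norm(g)

    samples = rng.standard_normal((500, dim))
    print(f"empirical max |grad f|: {max(grad_norm(x) for x in samples):.3f}")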