skip to main content

Search for: All records

Award ID contains: 2212442

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract

    One of the reasons why many neural networks are capable of replicating complicated tasks or functions is their universal approximation property. Though the past few decades have seen tremendous advances in theories of neural networks, a single constructive and elementary framework for neural network universality remains unavailable. This paper is an effort to provide a unified and constructive framework for the universality of a large class of activation functions including most of the existing ones. At the heart of the framework is the concept of neural network approximate identity (nAI). The main result is as follows: any nAI activation function is universal in the space of continuous functions on compacta. It turns out that most of the existing activation functions are nAI, and thus universal. The framework induces several advantages over the contemporary counterparts. First, it is constructive with elementary means from functional analysis, probability theory, and numerical analysis. Second, it is one of the first unified and constructive attempts that is valid for most of the existing activation functions. Third, it provides new proofs for most activation functions. Fourth, for a given activation and error tolerance, the framework provides precisely the architecture of the corresponding one-hidden neural network with a predetermined number of neurons and the values of weights/biases. Fifth, the framework allows us to abstractly present the first universal approximation with a favorable non-asymptotic rate. Sixth, our framework also provides insights into the developments, and hence providing constructive derivations, of some of the existing approaches.

    more » « less
  2. N/A (Ed.)

    Partial differential equation (PDE)-constrained inverse problems are some of the most challenging and computationally demanding problems in computational science today. Fine meshes required to accurately compute the PDE solution introduce an enormous number of parameters and require large-scale computing resources such as more processors and more memory to solve such systems in a reasonable time. For inverse problems constrained by time-dependent PDEs, the adjoint method often employed to compute gradients and higher order derivatives efficiently requires solving a time-reversed, so-called adjoint PDE that depends on the forward PDE solution at each timestep. This necessitates the storage of a high-dimensional forward solution vector at every timestep. Such a procedure quickly exhausts the available memory resources. Several approaches that trade additional computation for reduced memory footprint have been proposed to mitigate the memory bottleneck, including checkpointing and compression strategies. In this work, we propose a close-to-ideal scalable compression approach using autoencoders to eliminate the need for checkpointing and substantial memory storage, thereby reducing the time-to-solution and memory requirements. We compare our approach with checkpointing and an off-the-shelf compression approach on an earth-scale ill-posed seismic inverse problem. The results verify the expected close-to-ideal speedup for the gradient and Hessian-vector product using the proposed autoencoder compression approach. To highlight the usefulness of the proposed approach, we combine the autoencoder compression with the data-informed active subspace (DIAS) prior showing how the DIAS method can be affordably extended to large-scale problems without the need for checkpointing and large memory.

    more » « less
    Free, publicly-accessible full text available October 12, 2024
  3. N/A (Ed.)

    This work unifies the analysis of various randomized methods for solving linear and nonlinear inverse problems with Gaussian priors by framing the problem in a stochastic optimization setting. By doing so, we show that many randomized methods are variants of a sample average approximation (SAA). More importantly, we are able to prove a single theoretical result that guarantees the asymptotic convergence for a variety of randomized methods. Additionally, viewing randomized methods as an SAA enables us to prove, for the first time, a single non-asymptotic error result that holds for randomized methods under consideration. Another important consequence of our unified framework is that it allows us to discover new randomization methods. We present various numerical results for linear, nonlinear, algebraic, and PDE-constrained inverse problems that verify the theoretical convergence results and provide a discussion on the apparently different convergence rates and the behavior for various randomized methods.

    more » « less