Title: Operator Learning Using Random Features: A Tool for Scientific Computing
Supervised operator learning centers on the use of training data, in the form of input-output pairs, to estimate maps between infinite-dimensional spaces. It is emerging as a powerful tool to complement traditional scientific computing, which may often be framed in terms of operators mapping between spaces of functions. Building on the classical random features methodology for scalar regression, this paper introduces the function-valued random features method. This leads to a supervised operator learning architecture that is practical for nonlinear problems yet is structured enough to facilitate efficient training through the optimization of a convex, quadratic cost. Due to the quadratic structure, the trained model is equipped with convergence guarantees and error and complexity bounds, properties that are not readily available for most other operator learning architectures. At its core, the proposed approach builds a linear combination of random operators. This turns out to be a low-rank approximation of an operator-valued kernel ridge regression algorithm, and hence the method also has strong connections to Gaussian process regression. The paper designs function-valued random features that are tailored to the structure of two nonlinear operator learning benchmark problems arising from parametric partial differential equations. Numerical results demonstrate the scalability, discretization invariance, and transferability of the function-valued random features method.
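The core construction, a linear expansion in random operators whose coefficients solve a convex quadratic problem, is simple enough to sketch. The following is a minimal illustration only, not the paper's implementation: the pointwise cosine feature map, grid sizes, regularization, and placeholder data are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_random_features(n_in, n_out, n_feat, rng):
    """Draw random parameters once. Each feature maps a discretized input
    function (a vector in R^n_in) to a discretized output function (R^n_out).
    The cosine form here is an illustrative stand-in for the
    problem-tailored features designed in the paper."""
    W = rng.normal(size=(n_feat, n_out, n_in)) / np.sqrt(n_in)
    b = rng.uniform(0.0, 2.0 * np.pi, size=(n_feat, n_out))
    def phi(u):
        # Returns all n_feat features at once, shape (n_feat, n_out).
        return np.cos(np.einsum("jki,i->jk", W, u) + b)
    return phi

def fit_coefficients(phi, U, Y, reg=1e-8):
    """Training is a convex quadratic (ridge) problem in the expansion
    coefficients a: min_a sum_i ||sum_j a_j phi_j(u_i) - y_i||^2 + reg*||a||^2."""
    A = np.stack([phi(u) for u in U])              # (N, n_feat, n_out)
    N, J, m = A.shape
    D = A.transpose(0, 2, 1).reshape(N * m, J)     # flattened design matrix
    y = np.asarray(Y).reshape(N * m)
    return np.linalg.solve(D.T @ D + reg * np.eye(J), D.T @ y)

# Hypothetical usage on placeholder data: N function pairs on 64-point grids.
n_in, n_out, n_feat, N = 64, 64, 256, 100
phi = make_random_features(n_in, n_out, n_feat, rng)
U = rng.normal(size=(N, n_in))                     # toy input functions
Y = np.tanh(U)                                     # toy target operator
alpha = fit_coefficients(phi, U, Y)
predict = lambda u: alpha @ phi(u)                 # (n_feat,) @ (n_feat, n_out)
```

Because the coefficients enter linearly, solving the normal equations above is the entire training procedure; no gradient-based optimization is involved.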
Award ID(s):
1835860
PAR ID:
10548358
Author(s) / Creator(s):
;
Publisher / Repository:
Society for Industrial and Applied Mathematics
Date Published:
Journal Name:
SIAM Review
Volume:
66
Issue:
3
ISSN:
0036-1445
Page Range / eLocation ID:
535-571
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This work concerns the local convergence theory of Newton and quasi-Newton methods for convex-composite optimization, in which one minimizes an objective that can be written as the composition of a convex function with a continuously differentiable one. We focus on the case in which the convex function is a potentially infinite-valued piecewise linear-quadratic function. Such problems include nonlinear programming, minimax optimization, and estimation of nonlinear dynamics with non-Gaussian noise, as well as many modern approaches to large-scale data analysis and machine learning. Our approach embeds the optimality conditions for convex-composite optimization problems into a generalized equation (sketched after this item). We establish conditions for strong metric subregularity and strong metric regularity of the corresponding set-valued mappings. This allows us to extend the classical convergence theory of Newton and quasi-Newton methods to the broader class of nonfinite-valued piecewise linear-quadratic convex-composite optimization problems. In particular, we establish local quadratic convergence of the Newton method under conditions that parallel those in nonlinear programming.
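For orientation, the problem class and a Gauss-Newton-type linearization of its optimality conditions can be written as follows (the notation here is assumed, not taken from the paper):

```latex
% Convex-composite problem: f convex, piecewise linear-quadratic,
% possibly infinite-valued; c continuously differentiable.
\min_{x \in \mathbb{R}^n} \; f(c(x))

% First-order optimality embedded as a generalized equation:
0 \in \nabla c(\bar{x})^{\top} \, \partial f(c(\bar{x}))

% A Newton-type step solves the generalized equation with c linearized
% about the current iterate x_k:
0 \in \nabla c(x_k)^{\top} \,
      \partial f\bigl(c(x_k) + \nabla c(x_k)(x_{k+1} - x_k)\bigr)
```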
  2. We present Basis-to-Basis (B2B) operator learning, a novel approach for learning operators on Hilbert spaces of functions based on the foundational ideas of function encoders. We decompose the task of learning operators into two parts: learning sets of basis functions for both the input and output spaces, and learning a potentially nonlinear mapping between the coefficients of the basis functions. B2B operator learning circumvents many challenges of prior works, such as requiring data to be at fixed locations, by leveraging classic techniques such as least squares to compute the coefficients. It is especially potent for linear operators, where the mapping between bases is a single matrix transformation with a closed-form solution (see the sketch after this item). Furthermore, with minimal modifications and using the deep theoretical connections between function encoders and functional analysis, we derive operator learning algorithms that are directly analogous to eigendecomposition and singular value decomposition. We empirically validate B2B operator learning on seven benchmark operator learning tasks and show that it achieves up to a two-orders-of-magnitude improvement in accuracy over existing approaches on several of them.
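A minimal sketch of the two computational steps just described, assuming a fixed sine basis in place of the learned function-encoder bases; the toy operator and all names here are illustrative, not the authors' code.

```python
import numpy as np

def basis_coefficients(B, f_vals, reg=1e-10):
    """Least-squares projection of a discretized function onto basis columns:
    argmin_c ||B c - f||^2, with B of shape (n_points, n_basis)."""
    return np.linalg.solve(B.T @ B + reg * np.eye(B.shape[1]), B.T @ f_vals)

def fit_linear_operator(C_in, C_out):
    """For a linear operator, the basis-to-basis map is a single matrix A
    with closed-form least-squares solution A = C_out @ pinv(C_in), where
    columns of C_in/C_out hold the coefficients of the training pairs."""
    return C_out @ np.linalg.pinv(C_in)

# Hypothetical usage: a toy linear operator u -> 2u, sine basis on [0, 1].
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 128)
B = np.stack([np.sin((k + 1) * np.pi * x) for k in range(8)], axis=1)
U_vals = rng.normal(size=(20, 8)) @ B.T            # sampled input functions
V_vals = 2.0 * U_vals                              # their images under the operator
C_in = np.stack([basis_coefficients(B, u) for u in U_vals], axis=1)
C_out = np.stack([basis_coefficients(B, v) for v in V_vals], axis=1)
A = fit_linear_operator(C_in, C_out)               # recovers ~ 2 * identity
```

For a nonlinear operator, the matrix A would be replaced by a learned nonlinear map between the two coefficient vectors.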
  3. The paper is devoted to establishing relationships between global and local monotonicity, as well as their maximality versions, for single-valued and set-valued mappings between finite-dimensional and infinite-dimensional spaces (the notions are recalled after this item). We first show that for single-valued operators with convex domains in locally convex topological spaces, their continuity ensures that their global monotonicity agrees with the local one around any point of the graph. This also holds for set-valued mappings defined on the real line under a certain connectedness condition. The situation is different for set-valued operators in multidimensional spaces, as demonstrated by an example of a locally monotone operator on the plane that is not globally monotone. Finally, we invoke coderivative criteria from variational analysis to characterize both global and local maximal monotonicity of set-valued operators in Hilbert spaces, and to verify the equivalence between these monotonicity properties under the closed-graph and global hypomonotonicity assumptions.
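For reference, the notions being compared are the standard ones; the definitions are stated here for the reader's convenience, with notation assumed:

```latex
% A set-valued operator T : X \rightrightarrows X on a Hilbert space X is
% globally monotone if
\langle u - v, \, x - y \rangle \ge 0
\quad \text{for all } (x, u), (y, v) \in \operatorname{gph} T,

% and locally monotone around (\bar{x}, \bar{u}) \in \operatorname{gph} T
% if the same inequality holds for all such pairs in
% \operatorname{gph} T \cap W, for some neighborhood W of (\bar{x}, \bar{u}).
% Maximality means the graph admits no proper extension preserving the
% respective property.
```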
  4. Vector space models for symbolic processing that encode symbols by random vectors have been proposed in cognitive science and connectionist communities under the names Vector Symbolic Architecture (VSA) and, synonymously, Hyperdimensional (HD) computing. In this paper, we generalize VSAs to function spaces by mapping continuous-valued data into a vector space such that the inner product between the representations of any two data points represents a similarity kernel. By analogy to VSA, we call this new function encoding and computing framework Vector Function Architecture (VFA). In VFAs, vectors can represent individual data points as well as elements of a function space (a reproducing kernel Hilbert space). The algebraic vector operations, inherited from VSA, correspond to well-defined operations in function space. Furthermore, we study a previously proposed method for encoding continuous data, fractional power encoding (FPE), which uses exponentiation of a random base vector to produce randomized representations of data points and fulfills the kernel properties for inducing a VFA (see the sketch after this item). We show that the distribution from which elements of the base vector are sampled determines the shape of the FPE kernel, which in turn induces a VFA for computing with band-limited functions. In particular, VFAs provide an algebraic framework for implementing large-scale kernel machines with random features, extending Rahimi and Recht (2007). Finally, we demonstrate several applications of VFA models to problems in image recognition, density estimation, and nonlinear regression. Our analyses and results suggest that VFAs constitute a powerful new framework for representing and manipulating functions in distributed neural systems, with myriad applications in artificial intelligence.
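The fractional power encoding construction is compact enough to sketch directly. The dimension, phase distribution, and test points below are arbitrary illustrative choices:

```python
import numpy as np

def fpe_encode(x, base_phases):
    """Fractional power encoding: a random base vector z = exp(i * phases)
    with unit-modulus entries is raised to the real-valued power x, so
    z**x = exp(i * x * phases). Normalized inner products between such
    encodings approximate a shift-invariant kernel k(x - y) given by the
    characteristic function of the phase distribution (the same mechanism
    as random Fourier features, Rahimi and Recht, 2007)."""
    return np.exp(1j * x * base_phases)

rng = np.random.default_rng(2)
d = 10_000
phases = rng.normal(size=d)        # Gaussian phases -> Gaussian-shaped kernel
zx, zy = fpe_encode(0.3, phases), fpe_encode(0.8, phases)
similarity = (zx.conj() @ zy).real / d
# Expected value: exp(-(0.8 - 0.3)**2 / 2) ~= 0.8825
```

Swapping the Gaussian phase distribution for another one changes the induced kernel, which is exactly the degree of freedom the abstract refers to when it says the phase distribution determines the kernel's shape.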
  5. We establish the convergence of the forward-backward splitting algorithm based on Bregman distances for the sum of two monotone operators in reflexive Banach spaces (one step of the iteration is sketched after this item). Even in Euclidean spaces, the convergence of this algorithm has so far been proved only in the case of minimization problems. The proposed framework features Bregman distances that vary over the iterations and a novel assumption on the single-valued operator that captures various properties scattered in the literature. In the minimization setting, we obtain rates that are sharper than existing ones.
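One step of the algorithm being analyzed can be written schematically as follows (standard notation assumed; the paper's specific assumptions on the operators and on the varying Legendre functions are what the analysis actually turns on):

```latex
% Bregman forward-backward step for the monotone inclusion 0 \in (A + B)x,
% with A set-valued monotone, B single-valued, and \varphi_k a Legendre
% function inducing the Bregman distance D_{\varphi_k}:
x_{k+1} = \bigl( \nabla \varphi_k + \gamma_k A \bigr)^{-1}
          \bigl( \nabla \varphi_k(x_k) - \gamma_k B x_k \bigr)

% With \varphi_k = \tfrac{1}{2}\|\cdot\|^2 this reduces to the classical
% forward-backward step
% x_{k+1} = (\mathrm{Id} + \gamma_k A)^{-1}(x_k - \gamma_k B x_k).
```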