skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Representer Theorems in Banach Spaces: Minimum Norm Interpolation, Regularized Learning and Semi-Discrete Inverse Problems
Learning a function from a finite number of sampled data points (measurements) is a fundamental problem in science and engineering. This is often formulated as a minimum norm interpolation (MNI) problem, a regularized learning problem or, in general, a semi-discrete inverse problem (SDIP), in either Hilbert spaces or Banach spaces. The goal of this paper is to systematically study solutions of these problems in Banach spaces. We aim at obtaining explicit representer theorems for their solutions, on which convenient solution methods can then be developed. For the MNI problem, the explicit representer theorems enable us to express the infimum in terms of the norm of the linear combination of the interpolation functionals. For the purpose of developing efficient computational algorithms, we establish the fixed-point equation formulation of solutions of these problems. We reveal that unlike in a Hilbert space, in general, solutions of these problems in a Banach space may not be able to be reduced to truly finite dimensional problems (with certain infinite dimensional components hidden). We demonstrate how this obstacle can be removed, reducing the original problem to a truly finite dimensional one, in the special case when the Banach space is ℓ1(N).  more » « less
Award ID(s):
1912958
PAR ID:
10333653
Author(s) / Creator(s):
;
Editor(s):
Shawe-Taylor, John
Date Published:
Journal Name:
Journal of machine learning research
Volume:
22
Issue:
225
ISSN:
1532-4435
Page Range / eLocation ID:
1-65
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Sparsity of a learning solution is a desirable feature in machine learning. Certain reproducing kernel Banach spaces (RKBSs) are appropriate hypothesis spaces for sparse learning methods. The goal of this paper is to understand what kind of RKBSs can promote sparsity for learning solutions. We consider two typical learning models in an RKBS: the minimum norm interpolation (MNI) problem and the regularization problem. We first establish an explicit representer theorem for solutions of these problems, which represents the extreme points of the solution set by a linear combination of the extreme points of the subdifferential set, of the norm function, which is data-dependent. We then propose sufficient conditions on the RKBS that can transform the explicit representation of the solutions to a sparse kernel representation having fewer terms than the number of the observed data. Under the proposed sufficient conditions, we investigate the role of the regularization parameter on sparsity of the regularized solutions. We further show that two specific RKBSs, the sequence space l_1(N) and the measure space, can have sparse representer theorems for both MNI and regularization models. 
    more » « less
  2. We develop a variational framework to understand the properties of the functions learned by neural networks fit to data. We propose and study a family of continuous-domain linear inverse problems with total variation-like regularization in the Radon domain subject to data fitting constraints. We derive a representer theorem showing that finite-width, singlehidden layer neural networks are solutions to these inverse problems. We draw on many techniques from variational spline theory and so we propose the notion of polynomial ridge splines, which correspond to single-hidden layer neural networks with truncated power functions as the activation function. The representer theorem is reminiscent of the classical reproducing kernel Hilbert space representer theorem, but we show that the neural network problem is posed over a non-Hilbertian Banach space. While the learning problems are posed in the continuous-domain, similar to kernel methods, the problems can be recast as finite-dimensional neural network training problems. These neural network training problems have regularizers which are related to the well-known weight decay and path-norm regularizers. Thus, our result gives insight into functional characteristics of trained neural networks and also into the design neural network regularizers. We also show that these regularizers promote neural network solutions with desirable generalization properties. Keywords: neural networks, splines, inverse problems, regularization, sparsity 
    more » « less
  3. This paper introduces a novel theoretical framework for the analysis of vector-valued neu- ral networks through the development of vector-valued variation spaces, a new class of reproducing kernel Banach spaces. These spaces emerge from studying the regularization e↵ect of weight decay in training networks with activation functions like the rectified linear unit (ReLU). This framework o↵ers a deeper understanding of multi-output networks and their function-space characteristics. A key contribution of this work is the development of a representer theorem for the vector-valued variation spaces. This representer theorem estab- lishes that shallow vector-valued neural networks are the solutions to data-fitting problems over these infinite-dimensional spaces, where the network widths are bounded by the square of the number of training data. This observation reveals that the norm associated with these vector-valued variation spaces encourages the learning of features that are useful for multiple tasks, shedding new light on multi-task learning with neural networks. Finally, this paper develops a connection between weight-decay regularization and the multi-task lasso problem. This connection leads to novel bounds for layer widths in deep networks that depend on the intrinsic dimensions of the training data representations. This insight not only deepens the understanding of the deep network architectural requirements, but also yields a simple convex optimization method for deep neural network compression. The performance of this compression procedure is evaluated on various architectures. 
    more » « less
  4. Regularized learning problems in Banach spaces, which often minimize the sum of a data fidelity term in one Banach norm and a regularization term in another Banach norm, is challenging to solve. We construct a direct sum space based on the Banach spaces for the fidelity term and the regularization term and recast the objective function as the norm of a quotient space of the direct sum space. We then express the original regularized problem as an optimization problem in the dual space of the direct sum space. It is to find the maximum of a linear function on a convex polytope, which may be solved by linear programming. A solution of the original problem is then obtained by using related extremal properties of norming functionals from a solution of the dual problem. Numerical experiments demonstrate that the proposed duality approach is effective for solving the regularization learning problems. 
    more » « less
  5. We consider the top Lyapunov exponent associated to a dissipative linear evolution equation posed on a separable Hilbert or Banach space. In many applications in partial differential equations, such equations are often posed on a scale of nonequivalent spaces mitigating, e.g., integrability (L^p) or differentiability (W^{s,p}). In contrast to finite dimensions, the Lyapunov exponent could a priori depend on the choice of norm used. In this paper we show that under quite general conditions, the Lyapunov exponent of a cocycle of compact linear operators is independent of the norm used. We apply this result to two important problems from fluid mechanics: the enhanced dissipation rate for the advection diffusion equation with ergodic velocity field; and the Lyapunov exponent for the 2d Navier–Stokes equations with stochastic or periodic forcing. 
    more » « less