Title: Representer Theorems in Banach Spaces: Minimum Norm Interpolation, Regularized Learning and Semi-Discrete Inverse Problems
Learning a function from a finite number of sampled data points (measurements) is a fundamental problem in science and engineering. It is often formulated as a minimum norm interpolation (MNI) problem, a regularized learning problem, or, more generally, a semi-discrete inverse problem (SDIP), in either a Hilbert space or a Banach space. The goal of this paper is to systematically study solutions of these problems in Banach spaces. We aim to obtain explicit representer theorems for their solutions, from which convenient solution methods can then be developed. For the MNI problem, the explicit representer theorems enable us to express the infimum in terms of the norm of a linear combination of the interpolation functionals. For the purpose of developing efficient computational algorithms, we establish fixed-point equation formulations of the solutions of these problems. We reveal that, unlike in a Hilbert space, solutions of these problems in a general Banach space may not reduce to truly finite-dimensional problems (certain infinite-dimensional components may remain hidden). We demonstrate how this obstacle can be removed, reducing the original problem to a truly finite-dimensional one, in the special case when the Banach space is $$\ell_1(\mathbb{N})$$.
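For orientation, the two main problems can be stated schematically (the notation here is illustrative, not taken verbatim from the paper). Given a Banach space $$\mathcal{B}$$, continuous linear functionals $$\nu_1, \dots, \nu_m \in \mathcal{B}^*$$ modeling the measurements, and data $$y_1, \dots, y_m$$, the MNI problem is

$$\inf\{\, \|f\|_{\mathcal{B}} : f \in \mathcal{B},\ \nu_j(f) = y_j,\ j = 1, \dots, m \,\},$$

and the regularized learning problem relaxes the interpolation constraints into a data-fidelity term,

$$\min_{f \in \mathcal{B}} \ \sum_{j=1}^{m} \mathcal{L}\big(\nu_j(f), y_j\big) + \lambda \|f\|_{\mathcal{B}}, \qquad \lambda > 0,$$

for a prescribed loss $$\mathcal{L}$$. A representer theorem expresses a solution in terms of the finitely many functionals $$\nu_1, \dots, \nu_m$$, which is what makes numerical solution feasible.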
Award ID(s):
1912958
NSF-PAR ID:
10333653
Author(s) / Creator(s):
Editor(s):
Shawe-Taylor, John
Date Published:
Journal Name:
Journal of Machine Learning Research
Volume:
22
Issue:
225
ISSN:
1532-4435
Page Range / eLocation ID:
1-65
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. From a unified vision of vector-valued solutions in weighted Banach spaces, this paper establishes existence and uniqueness for space homogeneous Boltzmann bi-linear systems with conservative collisional forms arising in complex gas dynamical structures. This broader vision is directly applied to dilute multi-component gas mixtures composed of both monatomic and polyatomic gases. Such models can be viewed as extensions of scalar Boltzmann binary elastic flows, as well as of monatomic gas mixtures with disparate masses and of single polyatomic gases, providing a unified approach for vector-valued solutions in weighted Banach spaces. Novel aspects of this work include the extension of a general ODE theory in vector-valued weighted Banach spaces, precise lower bounds for the collision frequency in terms of the weighted Banach norm, energy identities, angular or compact manifold averaging lemmas which provide coerciveness resulting in global-in-time stability, and a new combinatorics estimate for p-binomial forms producing sharper estimates for the k-moments of bi-linear collisional forms. These techniques enable an improvement of the Cauchy theory that resolves the model with initial data corresponding to strictly positive and bounded initial vector-valued mass and total energy, in addition to only a $$2^+$$ moment determined by the hard potential rates discrepancy, a result comparable in generality to the classical Cauchy theory of the scalar homogeneous Boltzmann equation.
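    As a rough schematic of the setting (notation assumed here for illustration only): for a mixture of $$I$$ species with velocity distributions $$f_i = f_i(t, v)$$, a space homogeneous bi-linear system takes the form

    $$\partial_t f_i = \sum_{j=1}^{I} Q_{ij}(f_i, f_j), \qquad i = 1, \dots, I,$$

    with each $$Q_{ij}$$ a conservative bilinear collision operator, and the Cauchy theory is developed for the vector $$(f_1, \dots, f_I)$$ in a weighted Banach space.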
  2. We investigate the approximability of the following optimization problem. The input is an $$n \times n$$ matrix $$A = (A_{ij})$$ with real entries and an origin-symmetric convex body $$K \subset \mathbb{R}^n$$ that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form $$\sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} x_i x_j = \langle x, Ax \rangle$$ as $$x$$ ranges over $$K$$. This is a rich and expressive family of optimization problems; for different choices of matrices $$A$$ and convex bodies $$K$$ it includes a diverse range of problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion and more. While the literature has studied these special cases using case-specific reasoning, here we develop a general methodology for treating the approximability and inapproximability aspects of these questions. The underlying geometry of $$K$$ plays a critical role; we show under commonly used complexity assumptions that polytime constant-approximability necessitates that $$K$$ has a type-2 constant that grows slowly with $$n$$. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and the geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher-dimensional) auxiliary set that is not convex, but is approximately convex when $$K$$ has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with bounded type-2 constant, constant-factor approximability is preserved under the following basic operations: (a) subspaces, (b) quotients, (c) Minkowski sums, (d) complex interpolation. This yields a rich family of new examples where constant-factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization.
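    For reference, since the type-2 constant is central here but not defined in the abstract: a normed space $$X$$ has type 2 with constant $$T$$ if for every finite sequence $$x_1, \dots, x_k \in X$$,

    $$\Big( \mathbb{E}\, \Big\| \sum_{i=1}^{k} \varepsilon_i x_i \Big\|^2 \Big)^{1/2} \ \le\ T \Big( \sum_{i=1}^{k} \|x_i\|^2 \Big)^{1/2},$$

    where the $$\varepsilon_i$$ are independent uniform $$\pm 1$$ signs; the type-2 constant of an origin-symmetric convex body $$K$$ is understood via the norm that $$K$$ induces.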
  3. Many applications of representation learning, such as privacy preservation, algorithmic fairness, and domain adaptation, require explicit control over the semantic information being discarded. This goal is formulated as satisfying two objectives: maximizing utility for predicting a target attribute while simultaneously being invariant (independent) to a known semantic attribute. Solutions to invariant representation learning (IRepL) problems lead to a trade-off between utility and invariance when the two objectives compete. While existing works study bounds on this trade-off, two questions remain outstanding: 1) What is the exact trade-off between utility and invariance? and 2) What are the encoders (mapping the data to a representation) that achieve the trade-off, and how can we estimate them from training data? This paper addresses these questions for IRepL in reproducing kernel Hilbert spaces (RKHSs). Under the assumption that the distribution of a low-dimensional projection of high-dimensional data is approximately normal, we derive a closed-form solution for the global optima of the underlying optimization problem for encoders in RKHSs. This yields closed formulae for a near-optimal trade-off, the optimal representation dimensionality, and the corresponding encoder(s). We also numerically quantify the trade-off on representative problems and compare it to the trade-offs achieved by baseline IRepL algorithms.
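    One common way to formalize the two objectives (a schematic; the dependence measure is left abstract here, whereas the paper works with kernelized quantities in RKHSs) is

    $$\sup_{f} \ \mathrm{Util}\big(f(X), Y\big) \quad \text{subject to} \quad \mathrm{Dep}\big(f(X), S\big) \le \delta,$$

    where $$f$$ is the encoder, $$Y$$ the target attribute, and $$S$$ the semantic attribute to be invariant to; sweeping the budget $$\delta$$ traces out the utility-invariance trade-off curve.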
  4. The paper has two major themes. The first part of the paper establishes certain general results for infinite-dimensional optimization problems on Hilbert spaces. These results cover the classical representer theorem and many of its variants as special cases and offer a wider scope of applications. The second part of the paper then develops a systematic approach for learning the drift function of a stochastic differential equation by integrating the results of the first part with a Bayesian hierarchical framework. Importantly, our Bayesian approach incorporates low-cost sparse learning through the use of shrinkage priors while allowing quantification of uncertainty through posterior distributions. Several examples at the end illustrate the accuracy of our learning scheme.
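    Schematically (illustrative notation, not necessarily the paper's): the unknown is the drift $$b$$ in a stochastic differential equation

    $$dX_t = b(X_t)\, dt + \sigma\, dW_t,$$

    with $$W_t$$ a Brownian motion; in the Bayesian hierarchical approach, $$b$$ is expanded in a basis, the coefficients receive shrinkage priors that induce sparsity, and uncertainty is quantified through the resulting posterior distribution given observed trajectories.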
  5. In this paper, we propose an extension of the family of constructible dilating cones given by Kaliszewski (Quantitative Pareto analysis by cone separation technique, Kluwer Academic Publishers, Boston, 1994) from polyhedral pointed cones in finite-dimensional spaces to a general family of closed, convex, and pointed cones in infinite-dimensional spaces, which in particular covers all separable Banach spaces. We provide an explicit construction of the new family of dilating cones, focusing on sequence spaces and spaces of integrable functions equipped with their natural ordering cones. Finally, using the new dilating cones, we develop a conical regularization scheme for linearly constrained least-squares optimization problems. We present a numerical example to illustrate the efficacy of the proposed framework. 
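    As a sketch of how the dilating cones are used (notation assumed here, not taken from the paper): for a least-squares problem with an ordering cone $$C$$,

    $$\min_{x} \ \|Ax - b\|^2 \quad \text{subject to} \quad Lx \in C,$$

    conical regularization replaces $$C$$ by a member $$C_\varepsilon$$ of the dilating family, a slightly larger closed, convex, pointed cone containing $$C \setminus \{0\}$$ in its interior, and studies the regularized solutions as $$\varepsilon \to 0$$.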