skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Adaptive optimal transport
Abstract An adaptive, adversarial methodology is developed for the optimal transport problem between two distributions $$\mu $$ and $$\nu $$, known only through a finite set of independent samples $$(x_i)_{i=1..n}$$ and $$(y_j)_{j=1..m}$$. The methodology automatically creates features that adapt to the data, thus avoiding reliance on a priori knowledge of the distributions underlying the data. Specifically, instead of a discrete point-by-point assignment, the new procedure seeks an optimal map $T(x)$ defined for all $$x$$, minimizing the Kullback–Leibler divergence between $$(T(x_i))$$ and the target $$(y_j)$$. The relative entropy is given a sample-based, variational characterization, thereby creating an adversarial setting: as one player seeks to push forward one distribution to the other, the second player develops features that focus on those areas where the two distributions fail to match. The procedure solves local problems that seek the optimal transfer between consecutive, intermediate distributions between $$\mu $$ and $$\nu $$. As a result, maps of arbitrary complexity can be built by composing the simple maps used for each local problem. Displaced interpolation is used to guarantee global from local optimality. The procedure is illustrated through synthetic examples in one and two dimensions.  more » « less
Award ID(s):
1715753
PAR ID:
10102111
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Information and Inference: A Journal of the IMA
Volume:
8
Issue:
4
ISSN:
2049-8772
Page Range / eLocation ID:
p. 789-816
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Abstract Optimal transport maps and plans between two absolutely continuous measures $$\mu$$ and $$\nu$$ can be approximated by solving semidiscrete or fully discrete optimal transport problems. These two problems ensue from approximating $$\mu$$ or both $$\mu$$ and $$\nu$$ by Dirac measures. Extending an idea from Gigli (2011, On Hölder continuity-in-time of the optimal transport map towards measures along a curve. Proc. Edinb. Math. Soc. (2), 54, 401–409), we characterize how transport plans change under the perturbation of both $$\mu$$ and $$\nu$$. We apply this insight to prove error estimates for semidiscrete and fully discrete algorithms in terms of errors solely arising from approximating measures. We obtain weighted $L^2$ error estimates for both types of algorithms with a convergence rate $$O(h^{1/2})$$. This coincides with the rate in Theorem 5.4 in Berman (2018, Convergence rates for discretized Monge–Ampère equations and quantitative stability of optimal transport. Preprint available at arXiv:1803.00785) for semidiscrete methods, but the error notion is different. 
    more » « less
  2. Braverman, Mark (Ed.)
    We present a framework for speeding up the time it takes to sample from discrete distributions $$\mu$$ defined over subsets of size $$k$$ of a ground set of $$n$$ elements, in the regime where $$k$$ is much smaller than $$n$$. We show that if one has access to estimates of marginals $$\mathbb{P}_{S\sim \mu}[i\in S]$$, then the task of sampling from $$\mu$$ can be reduced to sampling from related distributions $$\nu$$ supported on size $$k$$ subsets of a ground set of only $$n^{1-\alpha}\cdot \operatorname{poly}(k)$$ elements. Here, $$1/\alpha\in [1, k]$$ is the parameter of entropic independence for $$\mu$$. Further, our algorithm only requires sparsified distributions $$\nu$$ that are obtained by applying a sparse (mostly $$0$$) external field to $$\mu$$, an operation that for many distributions $$\mu$$ of interest, retains algorithmic tractability of sampling from $$\nu$$. This phenomenon, which we dub domain sparsification, allows us to pay a one-time cost of estimating the marginals of $$\mu$$, and in return reduce the amortized cost needed to produce many samples from the distribution $$\mu$$, as is often needed in upstream tasks such as counting and inference. For a wide range of distributions where $$\alpha=\Omega(1)$$, our result reduces the domain size, and as a corollary, the cost-per-sample, by a $$\operatorname{poly}(n)$$ factor. Examples include monomers in a monomer-dimer system, non-symmetric determinantal point processes, and partition-constrained Strongly Rayleigh measures. Our work significantly extends the reach of prior work of Anari and Derezi\'nski who obtained domain sparsification for distributions with a log-concave generating polynomial (corresponding to $$\alpha=1$$). As a corollary of our new analysis techniques, we also obtain a less stringent requirement on the accuracy of marginal estimates even for the case of log-concave polynomials; roughly speaking, we show that constant-factor approximation is enough for domain sparsification, improving over $O(1/k)$ relative error established in prior work. 
    more » « less
  3. We consider solutions of the repulsive Vlasov–Poisson system which are a combination of a point charge and a small gas, i.e., measures of the form\delta_{(\mathcal{X}(t),\mathcal{V}(t))}+\mu^{2}d\mathbf{x}d\mathbf{v}for some(\mathcal{X}, \mathcal{V})\colon \mathbb{R}\to\mathbb{R}^{6}and a small gas distribution\mu\colon \mathbb{R}\to L^{2}_{\mathbf{x},\mathbf{v}}, and study asymptotic dynamics in the associated initial value problem. If initially suitable moments on\mu_{0}=\mu(t=0)are small, we obtain a global solution of the above form, and the electric field generated by the gas distribution \mudecays at an almost optimal rate. Assuming in addition boundedness of suitable derivatives of \mu_{0}, the electric field decays at an optimal rate, and we derive modified scattering dynamics for the motion of the point charge and the gas distribution. Our proof makes crucial use of the Hamiltonian structure. The linearized system is transport by the Kepler ODE, which we integrate exactly through an asymptotic action-angle transformation. Thanks to a precise understanding of the associated kinematics, moment and derivative control is achieved via a bootstrap analysis that relies on the decay of the electric field associated to\mu. The asymptotic behavior can then be deduced from the properties of Poisson brackets in asymptotic action coordinates. 
    more » « less
  4. Abstract Isotonic regression is a standard problem in shape-constrained estimation where the goal is to estimate an unknown non-decreasing regression function $$f$$ from independent pairs $$(x_i, y_i)$$ where $${\mathbb{E}}[y_i]=f(x_i), i=1, \ldots n$$. While this problem is well understood both statistically and computationally, much less is known about its uncoupled counterpart, where one is given only the unordered sets $$\{x_1, \ldots , x_n\}$$ and $$\{y_1, \ldots , y_n\}$$. In this work, we leverage tools from optimal transport theory to derive minimax rates under weak moments conditions on $$y_i$$ and to give an efficient algorithm achieving optimal rates. Both upper and lower bounds employ moment-matching arguments that are also pertinent to learning mixtures of distributions and deconvolution. 
    more » « less
  5. Abstract Consider $$(X_{i}(t))$$ ( X i ( t ) ) solving a system of N stochastic differential equations interacting through a random matrix $${\mathbf {J}} = (J_{ij})$$ J = ( J ij ) with independent (not necessarily identically distributed) random coefficients. We show that the trajectories of averaged observables of $$(X_i(t))$$ ( X i ( t ) ) , initialized from some $$\mu $$ μ independent of  $${\mathbf {J}}$$ J , are universal, i.e., only depend on the choice of the distribution $$\mathbf {J}$$ J through its first and second moments (assuming e.g., sub-exponential tails). We take a general combinatorial approach to proving universality for dynamical systems with random coefficients, combining a stochastic Taylor expansion with a moment matching-type argument. Concrete settings for which our results imply universality include aging in the spherical SK spin glass, and Langevin dynamics and gradient flows for symmetric and asymmetric Hopfield networks. 
    more » « less