Quantitative stability and error estimates for optimal transport plans
Abstract: Optimal transport maps and plans between two absolutely continuous measures $\mu$ and $\nu$ can be approximated by solving semidiscrete or fully discrete optimal transport problems. These two problems ensue from approximating $\mu$ or both $\mu$ and $\nu$ by Dirac measures. Extending an idea from Gigli (2011, On Hölder continuity-in-time of the optimal transport map towards measures along a curve. Proc. Edinb. Math. Soc. (2), 54, 401–409), we characterize how transport plans change under the perturbation of both $\mu$ and $\nu$. We apply this insight to prove error estimates for semidiscrete and fully discrete algorithms in terms of errors solely arising from approximating measures. We obtain weighted $L^2$ error estimates for both types of algorithms with a convergence rate $O(h^{1/2})$. This coincides with the rate in Theorem 5.4 in Berman (2018, Convergence rates for discretized Monge–Ampère equations and quantitative stability of optimal transport. Preprint available at arXiv:1803.00785) for semidiscrete methods, but the error notion is different.
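To make the fully discrete setting concrete, the following is a minimal one-dimensional sketch, not the algorithm analysed in the paper. It assumes $\mu$ is uniform on $[0,1]$ and $\nu$ has density $2y$ on $[0,1]$, so that the exact optimal map for the quadratic cost is $T(x)=\sqrt{x}$. Both measures are replaced by Dirac masses on a grid of size $h=1/n$, and the discrete plan is the monotone coupling, which is optimal in one dimension for convex costs. The function names `monotone_plan` and `plan_error`, the choice of measures, and the error quantity $\big(\sum_{i,j}\gamma_{ij}\,|y_j-T(x_i)|^2\big)^{1/2}$ are illustrative assumptions; the weighted $L^2$ error notion used in the paper may differ.

```python
import numpy as np


def monotone_plan(a, b):
    """Monotone (north-west corner) coupling of two 1D discrete measures.

    `a` and `b` are the weights of atoms sorted in increasing order.
    In one dimension this coupling is optimal for convex costs.
    """
    a = np.asarray(a, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    plan = np.zeros((a.size, b.size))
    i = j = 0
    while i < a.size and j < b.size:
        m = min(a[i], b[j])      # move as much mass as both atoms allow
        plan[i, j] += m
        a[i] -= m
        b[j] -= m
        if a[i] <= 1e-15:        # source atom exhausted
            i += 1
        if b[j] <= 1e-15:        # target atom exhausted
            j += 1
    return plan


def plan_error(n):
    """Illustrative error of the discrete plan against T(x) = sqrt(x).

    Assumed setup (hypothetical): mu = Uniform[0, 1], nu has density 2y.
    """
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h         # atoms of the discretised mu
    a = np.full(n, h)                    # mu-mass of each cell
    y = (np.arange(n) + 0.5) * h         # atoms of the discretised nu
    edges = np.arange(n + 1) * h
    b = np.diff(edges ** 2)              # nu-mass of each cell, F(y) = y^2
    plan = monotone_plan(a, b)
    exact = np.sqrt(x)                   # exact optimal map at the atoms
    sq_dist = (y[None, :] - exact[:, None]) ** 2
    return float(np.sqrt(np.sum(plan * sq_dist)))


if __name__ == "__main__":
    for n in (10, 40, 160, 640):
        print(n, plan_error(n))
```

Running the script prints the error for successively finer grids, which gives a quick empirical check that the discrete plan approaches the exact map as $h \to 0$ in this toy setting.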
NSF-PAR ID: 10299281
Journal Name: IMA Journal of Numerical Analysis
Volume: 41
Issue: 3
Page Range or eLocation-ID: 1941 to 1965
ISSN: 0272-4979
We present a framework for speeding up the time it takes to sample from discrete distributions $\mu$ defined over subsets of size $k$ of a ground set of $n$ elements, in the regime where $k$ is much smaller than $n$. We show that if one has access to estimates of marginals $\mathbb{P}_{S\sim \mu}[i\in S]$, then the task of sampling from $\mu$ can be reduced to sampling from related distributions $\nu$ supported on size $k$ subsets of a ground set of only $n^{1-\alpha}\cdot \operatorname{poly}(k)$ elements. Here, $1/\alpha\in [1, k]$ is the parameter of entropic independence for $\mu$. Further, our algorithm only requires sparsified distributions $\nu$ that are obtained by applying a sparse (mostly $0$) external field to $\mu$, an operation that for many distributions $\mu$ of interest, retains algorithmic tractability of sampling from $\nu$. This phenomenon, which we dub domain sparsification, allows us to pay a one-time cost of estimating the marginals of $\mu$, and in return reduce the amortized cost needed to produce many samples from the distribution $\mu$, as is often needed in upstream tasks such as counting and inference. For a wide range of distributions where $\alpha=\Omega(1)$, our result reduces the domain size, and as a corollary, the …
An adaptive, adversarial methodology is developed for the optimal transport problem between two distributions $\mu$ and $\nu$, known only through a finite set of independent samples $(x_i)_{i=1..n}$ and $(y_j)_{j=1..m}$. The methodology automatically creates features that adapt to the data, thus avoiding reliance on a priori knowledge of the distributions underlying the data. Specifically, instead of a discrete point-by-point assignment, the new procedure seeks an optimal map $T(x)$ defined for all $x$, minimizing the Kullback–Leibler divergence between $(T(x_i))$ and the target $(y_j)$. The relative entropy is given a sample-based, variational characterization, thereby creating an adversarial setting: as one player seeks to push forward one distribution to the other, the second player develops features that focus on those areas where the two distributions fail to match. The procedure solves local problems that seek the optimal transfer between consecutive, intermediate distributions between $\mu$ and $\nu$. As a result, maps of arbitrary complexity can be built by composing the simple maps used for each local problem. Displacement interpolation is used to guarantee global from local optimality. The procedure is illustrated through synthetic examples in one and two dimensions.
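As an illustration of what a sample-based, variational characterization of the relative entropy can look like, one standard choice is the Donsker–Varadhan representation. It is shown here as an assumption; the abstract does not say which variational form the authors use, and theirs may be a different but related one. Writing $\rho$ for the distribution of $T(x)$ and $g$ for a test function (both symbols are introduced here for illustration), the formula and its empirical counterpart read

$$
\mathrm{KL}(\rho\,\|\,\nu) \;=\; \sup_{g}\Big\{\mathbb{E}_{\rho}[g] - \log \mathbb{E}_{\nu}\big[e^{g}\big]\Big\}
\;\approx\; \sup_{g}\Big\{\frac{1}{n}\sum_{i=1}^{n} g\big(T(x_i)\big) - \log\Big(\frac{1}{m}\sum_{j=1}^{m} e^{g(y_j)}\Big)\Big\}.
$$

In this reading, the map $T$ is chosen to minimize the right-hand side while $g$ is chosen to maximize it, which matches the two-player description above: $g$ plays the role of the feature that concentrates where the pushed-forward samples and the target samples fail to match. In practice the supremum would be restricted to a parametrized family of functions $g$.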