Title: Linear optimal transport embedding: provable Wasserstein classification for certain rigid transformations and perturbations
Abstract: Discriminating between distributions is an important problem in a number of scientific fields. This motivated the introduction of Linear Optimal Transportation (LOT), which embeds the space of distributions into an $L^2$-space. The transform is defined by computing the optimal transport of each distribution to a fixed reference distribution, and has a number of benefits when it comes to speed of computation and to determining classification boundaries. In this paper, we characterize a number of settings in which LOT embeds families of distributions into a space in which they are linearly separable. This is true in arbitrary dimension, and for families of distributions generated through perturbations of shifts and scalings of a fixed distribution. We also prove conditions under which the $L^2$ distance of the LOT embedding between two distributions in arbitrary dimension is nearly isometric to the Wasserstein-2 distance between those distributions. This is of significant computational benefit, as one need only compute $N$ optimal transport maps to define the $N^2$ pairwise distances between $N$ distributions. We demonstrate the benefits of LOT on a number of distribution classification problems.
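The "$N$ maps versus $N^2$ pairwise distances" point is easiest to see in one dimension, where the optimal transport map to a reference distribution has a closed form via quantile functions. The following is a minimal, hypothetical sketch (not the paper's implementation): it embeds empirical samples against a uniform reference on $[0,1]$ and compares two embeddings in $L^2$. For a pure shift the embedding distance recovers the Wasserstein-2 distance exactly.

```python
import numpy as np

def lot_embedding(samples, ref_grid):
    """Embed an empirical 1-D distribution via its optimal transport map
    to a uniform reference on [0, 1], evaluated at the reference points.
    In 1-D the Monge map is T = F_mu^{-1} o F_ref, i.e. the quantile
    function of mu evaluated at the reference's CDF values."""
    return np.quantile(samples, ref_grid)

def lot_distance(emb_a, emb_b):
    """L^2(reference) distance between two LOT embeddings."""
    return np.sqrt(np.mean((emb_a - emb_b) ** 2))

rng = np.random.default_rng(0)
ref = (np.arange(1000) + 0.5) / 1000   # midpoint grid on [0, 1]
x = rng.normal(0.0, 1.0, 5000)         # samples from mu
y = x + 3.0                            # nu = mu shifted by 3

d = lot_distance(lot_embedding(x, ref), lot_embedding(y, ref))
# For a pure shift the two embeddings differ by the constant shift,
# so the LOT distance recovers W2: d is (numerically) 3.0 here.
```

Note that each of the $N$ distributions is embedded once; every pairwise distance afterwards is a cheap vector operation.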
Award ID(s):
2111322 1819222 2012266
PAR ID:
10359677
Author(s) / Creator(s):
Date Published:
Journal Name:
Information and Inference: A Journal of the IMA
ISSN:
2049-8772
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: In this paper we study supervised learning tasks on the space of probability measures. We approach this problem by embedding the space of probability measures into $L^2$ spaces using the optimal transport framework. In the embedding spaces, regular machine learning techniques are used to achieve linear separability. This idea has proved successful in applications and when the classes to be separated are generated by shifts and scalings of a fixed measure. This paper extends the class of elementary transformations suitable for the framework to families of shearings, describing conditions under which two classes of sheared distributions can be linearly separated. We furthermore give necessary bounds on the transformations to achieve a pre-specified separation level, and show how multiple embeddings can be used to allow for larger families of transformations. We demonstrate our results on image classification tasks.
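The linear-separability claim in abstract 1 can be mimicked in one dimension with a toy computation (a hypothetical sketch, not the authors' experiments). Two classes generated by shifts of the same base measure have LOT embeddings that differ coordinatewise by a constant, so a single linear functional separates them:

```python
import numpy as np

rng = np.random.default_rng(3)
base = rng.normal(size=2000)           # samples of a fixed base measure
grid = (np.arange(100) + 0.5) / 100    # reference grid on [0, 1]

def embed(samples):
    # 1-D LOT embedding w.r.t. uniform[0,1]: quantile function on a grid
    return np.quantile(samples, grid)

# two classes generated by positive vs. negative shifts of the base
class_a = np.stack([embed(base + s) for s in rng.uniform(1, 2, 25)])
class_b = np.stack([embed(base + s) for s in rng.uniform(-2, -1, 25)])

# a shift moves every embedding coordinate by the same constant, so
# projecting onto the all-ones direction separates the classes linearly
scores_a = class_a.mean(axis=1)
scores_b = class_b.mean(axis=1)
# every class-a score exceeds every class-b score by at least the gap
# between the shift ranges
```

Shearings, the transformations treated in the paper itself, require the more careful conditions the abstract describes; this sketch only covers the previously known shift case.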
  2. Summary: Sequential Monte Carlo algorithms are widely accepted as powerful computational tools for making inference with dynamical systems. A key step in sequential Monte Carlo is resampling, which plays the role of steering the algorithm towards the future dynamics. Several strategies have been used in practice, including multinomial resampling, residual resampling, optimal resampling, stratified resampling and optimal transport resampling. In one-dimensional cases, we show that optimal transport resampling is equivalent to stratified resampling on the sorted particles, and both strategies minimize the resampling variance as well as the expected squared energy distance between the original and resampled empirical distributions. For general $d$-dimensional cases, we show that if the particles are first sorted using the Hilbert curve, the variance of stratified resampling is $O(m^{-(1+2/d)})$, an improvement over the best previously known rate of $O(m^{-(1+1/d)})$, where $m$ is the number of resampled particles. We show that this improved rate is optimal for ordered stratified resampling schemes, as conjectured in Gerber et al. (2019). We also present an almost-sure bound on the Wasserstein distance between the original and Hilbert-curve-resampled empirical distributions. In light of these results, we show that for dimension $d > 1$ the mean square error of sequential quasi-Monte Carlo with $n$ particles can be $O(n^{-1-4/\{d(d+4)\}})$ if Hilbert curve resampling is used and a specific low-discrepancy set is chosen. To our knowledge, this is the first known convergence rate lower than $o(n^{-1})$.
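Stratified resampling on sorted particles, which abstract 2 shows coincides with optimal transport resampling in one dimension, takes only a few lines; this is a hypothetical illustration, not the paper's code. With equal weights and $m = n$, each stratum captures exactly one particle, so resampling reproduces the original particle set.

```python
import numpy as np

def stratified_resample_sorted(particles, weights, u):
    """Stratified resampling applied to particles sorted in ascending
    order (in 1-D this coincides with optimal transport resampling).
    u holds one uniform draw per stratum: the points (i + u[i]) / m
    place one query in each of the m equal strata of [0, 1]."""
    order = np.argsort(particles)
    sorted_particles = particles[order]
    cdf = np.cumsum(weights[order])
    cdf[-1] = 1.0                          # guard against rounding
    m = len(u)
    positions = (np.arange(m) + u) / m     # one point per stratum
    idx = np.searchsorted(cdf, positions)
    return sorted_particles[idx]

rng = np.random.default_rng(1)
x = rng.normal(size=8)
w = np.full(8, 1.0 / 8)                    # equal weights
out = stratified_resample_sorted(x, w, rng.uniform(size=8))
# With equal weights and m == n, each stratum contains exactly one
# particle, so out equals the sorted original particles.
```

The Hilbert-curve variant in the paper differs only in how the $d$-dimensional particles are ordered before this 1-D step.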
  3. Wasserstein distances form a family of metrics on spaces of probability measures that have recently seen many applications. However, statistical analysis in these spaces is complex due to the nonlinearity of Wasserstein spaces. One potential solution to this problem is Linear Optimal Transport (LOT). This method allows one to find a Euclidean embedding, called the LOT embedding, of measures in some Wasserstein spaces, but some information is lost in this embedding. So, to understand whether statistical analysis relying on LOT embeddings can make valid inferences about the original data, it is helpful to quantify how well these embeddings describe that data. To answer this question, we present a decomposition of the Fréchet variance of a set of measures in the 2-Wasserstein space, which allows one to compute the percentage of variance explained by LOT embeddings of those measures. We then extend this decomposition to the Fused Gromov-Wasserstein setting. We also present several experiments that explore the relationship between the dimension of the LOT embedding, the percentage of variance explained by the embedding, and the classification accuracy of machine learning classifiers built on the embedded data. We use the MNIST handwritten digits dataset, the IMDB-50000 dataset, and Diffusion Tensor MRI images for these experiments. Our results illustrate the effectiveness of low-dimensional LOT embeddings in terms of the percentage of variance explained and the classification accuracy of models built on the embedded data.
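The relationship in abstract 3 between embedding dimension and percentage of variance explained can be mimicked with a toy computation (a hypothetical sketch, not the authors' experiments): LOT-embed a two-parameter family of 1-D Gaussians via their quantile functions, then read off PCA explained-variance ratios of the embedding vectors. Because the family has two parameters, two components should capture essentially all the variance.

```python
import numpy as np
from scipy.stats import norm

# Each measure is a 1-D Gaussian N(mean_i, std_i); its LOT embedding
# w.r.t. a uniform reference on [0, 1] is its quantile function sampled
# on a grid (a vector in R^200).
rng = np.random.default_rng(2)
grid = (np.arange(200) + 0.5) / 200
means = rng.normal(0.0, 2.0, size=30)
stds = rng.uniform(0.5, 2.0, size=30)

# Gaussian quantile function: q(t) = mean + std * Phi^{-1}(t)
E = means[:, None] + stds[:, None] * norm.ppf(grid)[None, :]

# PCA of the embeddings via SVD of the centered matrix
Ec = E - E.mean(axis=0)
s = np.linalg.svd(Ec, compute_uv=False)
ratio = np.cumsum(s**2) / np.sum(s**2)   # cumulative variance explained
# ratio[k-1] is the fraction of variance a k-dimensional LOT/PCA
# embedding explains; here ratio[1] is essentially 1.0, since the
# Gaussian embeddings lie in a 2-dimensional affine subspace.
```

In 1-D the LOT embedding is an isometry, so the embedding variance equals the Fréchet variance; the paper's decomposition quantifies the gap in higher dimensions, where information is lost.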
  4. A quasiconformal tree is a doubling metric tree in which the diameter of each arc is bounded above by a fixed multiple of the distance between its endpoints. In this paper we show that every quasiconformal tree bi-Lipschitz embeds in some Euclidean space, with the ambient dimension and the bi-Lipschitz constant depending only on the doubling and bounded turning constants of the tree. This answers Question 1.6 of David and Vellis [Illinois J. Math. 66 (2022), pp. 189–244]. 
  5. Abstract: By quantifying the distance between two collider events, one can triangulate a metric space and reframe collider data analysis as computational geometry. One popular geometric approach is to first represent events as an energy flow on an idealized celestial sphere and then define the metric in terms of optimal transport in two dimensions. In this paper, we advocate for representing events in terms of a spectral function that encodes pairwise particle angles and products of particle energies, which enables a metric distance defined in terms of one-dimensional optimal transport. This approach has the advantage of automatically incorporating obvious isometries of the data, like rotations about the colliding beam axis. It also facilitates first-principles calculations, since there are simple closed-form expressions for optimal transport in one dimension. Up to isometries and event sets of measure zero, the spectral representation is unique, so the metric on the space of spectral functions is a metric on the space of events. At lowest order in perturbation theory in electron-positron collisions, our metric is simply the summed squared invariant masses of the two event hemispheres. Going to higher orders, we present predictions for the distribution of metric distances between jets in fixed-order and resummed perturbation theory as well as in parton-shower generators. Finally, we speculate on whether the spectral approach could furnish a useful metric on the space of quantum field theories.
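Abstract 5's appeal to closed-form one-dimensional optimal transport refers to the standard fact that in 1-D the Wasserstein distance reduces to an integral of CDF or quantile differences, e.g. $W_1(\mu,\nu) = \int |F_\mu(t) - F_\nu(t)|\,dt$. The sketch below (hypothetical, for unweighted point spectra, not the paper's observable) evaluates that formula exactly on the merged support:

```python
import numpy as np

def w1_1d(xs, ys):
    """Closed-form 1-D Wasserstein-1 distance between two unweighted
    point spectra: W1 = integral of |F(t) - G(t)| dt, where F and G are
    the empirical CDFs. Exact, no linear program needed."""
    xs, ys = np.sort(xs), np.sort(ys)
    grid = np.sort(np.concatenate([xs, ys]))      # merged support
    F = np.searchsorted(xs, grid, side="right") / xs.size
    G = np.searchsorted(ys, grid, side="right") / ys.size
    # |F - G| is piecewise constant between consecutive support points
    return float(np.sum(np.abs(F - G)[:-1] * np.diff(grid)))

a = np.array([0.0, 1.0])
b = a + 2.0           # same spectrum rigidly shifted by 2
d = w1_1d(a, b)       # every unit of mass moves by 2, so d == 2.0
```

This $O(n \log n)$ cost, versus solving a transport problem on the celestial sphere, is the computational advantage the spectral representation buys.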