The optimal transport barycenter (a.k.a. Wasserstein barycenter) is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computation of the unregularized barycenter for discretized probability distributions on point clouds is a challenging task when the domain dimension d>1. Most practical algorithms for approximating the barycenter problem are based on entropic regularization. In this paper, we introduce a nearly linear time O(mlogm) and linear space complexity O(m) primal-dual algorithm, the Wasserstein-Descent ℍ˙1-Ascent (WDHA) algorithm, for computing the exact barycenter when the input probability density functions are discretized on an m-point grid. The key success of the WDHA algorithm hinges on alternating between two different yet closely related Wasserstein and Sobolev optimization geometries for the primal barycenter and dual Kantorovich potential subproblems. Under reasonable assumptions, we establish the convergence rate and iteration complexity of WDHA to its stationary point when the step size is appropriately chosen. Superior computational efficacy, scalability, and accuracy over the existing Sinkhorn-type algorithms are demonstrated on high-resolution (e.g., 1024×1024 images) 2D synthetic and real data.
more »
« less
A pre-expectation calculus for probabilistic sensitivity
Sensitivity properties describe how changes to the input of a program affect the output, typically by upper bounding the distance between the outputs of two runs by a monotone function of the distance between the corresponding inputs. When programs are probabilistic, the distance between outputs is a distance between distributions. The Kantorovich lifting provides a general way of defining a distance between distributions by lifting the distance of the underlying sample space; by choosing an appropriate distance on the base space, one can recover other usual probabilistic distances, such as the Total Variation distance. We develop a relational pre-expectation calculus to upper bound the Kantorovich distance between two executions of a probabilistic program. We illustrate our methods by proving algorithmic stability of a machine learning algorithm, convergence of a reinforcement learning algorithm, and fast mixing for card shuffling algorithms. We also consider some extensions: using our calculus to show convergence of Markov chains to the uniform distribution over states and an asynchronous extension to reason about pairs of program executions with different control flow.
more »
« less
- PAR ID:
- 10216894
- Date Published:
- Journal Name:
- Proceedings of the ACM on Programming Languages
- Volume:
- 5
- Issue:
- POPL
- ISSN:
- 2475-1421
- Page Range / eLocation ID:
- 1 to 28
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Wasserstein Barycenter is a principled approach to represent the weighted mean of a given set of probability distributions, utilizing the geometry induced by optimal transport. In this work, we present a novel scalable algorithm to approximate the Wasserstein Barycenters aiming at highdimensional applications in machine learning. Our proposed algorithm is based on the Kantorovich dual formulation of the Wasserstein-2 distance as well as a recent neural network architecture, input convex neural network, that is known to parametrize convex functions. The distinguishing features of our method are: i) it only requires samples from the marginal distributions; ii) unlike the existing approaches, it represents the Barycenter with a generative model and can thus generate infinite samples from the barycenter without querying the marginal distributions; iii) it works similar to Generative Adversarial Model in one marginal case. We demonstrate the efficacy of our algorithm by comparing it with the state-of-art methods in multiple experiments.more » « less
-
The classical static Schrödinger Bridge (SSB) problem, which seeks the most likely stochastic evolution between two marginal probability measures, has been studied extensively in the optimal transport and statistical physics communities, and more recently in machine learning communities in the surge of generative models. The standard approach to solve SSB is to first identify its Kantorovich dual and use Sinkhorn's algorithm to find the optimal potential functions. While the original SSB is only a strictly convex minimization problem, this approach is known to warrant linear convergence under mild assumptions. In this work, we consider a generalized SSB allowing any strictly increasing divergence functional, far generalizing the entropy functional x log(x) in the standard SSB. This problem naturally arises in a wide range of seemingly unrelated problems in entropic optimal transport, random graphs/matrices, and combinatorics. We establish Kantorovich duality and linear convergence of Sinkhorn's algorithm for the generalized SSB problem under mild conditions. Our results provide a new rigorous foundation for understanding Sinkhorn-type iterative methods in the context of large-scale generalized Schrödinger bridges.more » « less
-
This work is devoted to the study of rates of convergence of the empirical measures μn over a sample (Xk) of independent identically distributed real-valued random variables towards the common distribution μ in Kantorovich transport distances Wp. The focus is on finite range bounds on the expected Kantorovich distances E(Wp(μn, μ)) in terms of moments and analytic conditions on the measure μ and its distribution function. The study describes a variety of rates, from the standard one to slower rates, and both lower and upper-bounds on E(Wp(μn,μ)) for fixed n in various instances. Order statistics, reduction to uniform samples and analysis of beta distributions, inverse distribution functions, log-concavity are main tools in the investigation. Two detailed appendices collect classical and some new facts on inverse distribution functions and beta distributions and their densities necessary to the investigation.more » « less
-
Session types provide a formal type system to define and verify communication protocols between message-passing processes. In order to analyze randomized systems, recent works have extended session types with probabilistic type constructors. Unfortunately, all the proposed extensions only support constant probabilities which limits their applicability to real-world systems. Our work addresses this limitation by introducing probabilistic refinement session types which enable symbolic reasoning for concurrent probabilistic systems in a core calculus we call PReST. The type system is carefully designed to be a conservative extension of refinement session types and supports both probabilistic and regular choice type operators. We also implement PReST in a prototype which we use for validating probabilistic concurrent programs. The added expressive power leads to significant challenges, both in the meta theory and implementation of PReST, particularly with type checking: it requires reconstructing intermediate types for channels when type checking probabilistic branching expressions. The theory handles this by semantically quantifying refinement variables in probabilistic typing rules, a deviation from standard refinement type systems. The implementation relies on a bi-directional type checker that uses an SMT solver to reconstruct the intermediate types minimizing annotation overhead and increasing usability. To guarantee that probabilistic processes are almost-surely terminating, we integrate cost analysis into our type system to obtain expected upper bounds on recursion depth. We evaluate PReST on a wide variety of benchmarks from 4 categories: (i) randomized distributed protocols such as Itai and Rodeh's leader election, bounded retransmission, etc., (ii) parametric Markov chains such as random walks, (iii) probabilistic analysis of concurrent data structures such as queues, and (iv) distributions obtained by composing uniform distributions using operators like max, and sum. Our experiments show that the PReST type checker scales to large programs with sophisticated probabilistic distributions.more » « less
An official website of the United States government

