Wasserstein GANs are increasingly used in computer vision applications because they are easier to train. Previous WGAN variants mainly use the l1 transport cost to compute the Wasserstein distance between the real and synthetic data distributions. The l1 transport cost restricts the discriminator to be 1-Lipschitz. However, WGANs with l1 transport cost were recently shown to not always converge. In this paper, we propose WGAN-QC, a WGAN with quadratic transport cost. Based on the quadratic transport cost, we propose an Optimal Transport Regularizer (OTR) to stabilize the training process of WGAN-QC. We prove that the objective of the discriminator during each generator update computes the exact quadratic Wasserstein distance between real and synthetic data distributions. We also prove that WGAN-QC converges to a local equilibrium point with finite discriminator updates per generator update. We show experimentally on a Dirac distribution that WGAN-QC converges, whereas many of the l1-cost WGANs fail to converge [22]. Qualitative and quantitative results on the CelebA, CelebA-HQ, LSUN and the ImageNet dog datasets show that WGAN-QC is better than state-of-the-art GAN methods. WGAN-QC also has a much faster runtime than other WGAN variants.
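To make the difference between the two transport costs concrete, here is a minimal Python sketch; it is not taken from the paper. It assumes one-dimensional samples of equal size with uniform weights, a setting in which matching sorted values gives an optimal coupling for both costs. The function names `w1_1d` and `w2_squared_1d` are hypothetical. WGAN-QC itself operates on high-dimensional images with a learned discriminator, which this toy does not attempt to reproduce.

```python
def w1_1d(xs, ys):
    """Wasserstein distance with l1 cost between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    # Assumption: 1-D, uniform weights, so sorted matching is an optimal coupling.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)


def w2_squared_1d(xs, ys):
    """Squared Wasserstein-2 distance (quadratic cost) in the same setting."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y) ** 2 for x, y in pairs) / len(xs)


real = [0.0, 1.0, 2.0, 3.0]
fake = [3.5, 0.5, 2.5, 1.5]  # the same values shifted by 0.5 and shuffled

print(w1_1d(real, fake))          # 0.5
print(w2_squared_1d(real, fake))  # 0.25
```

Both quantities ignore the order of the samples and depend only on how far mass has to move; the quadratic cost penalizes long moves more heavily than short ones.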
EFFICIENT DISCRETE MULTI-MARGINAL OPTIMAL TRANSPORT REGULARIZATION
Optimal transport has emerged as a powerful tool for a variety of problems in machine learning, and it is frequently used to enforce distributional constraints. In this context, existing methods often use either a Wasserstein metric, or else they apply concurrent barycenter approaches when more than two distributions are considered. In this paper, we leverage multi-marginal optimal transport (MMOT), where we take advantage of a procedure that computes a generalized earth mover’s distance as a sub-routine. We show that not only is our algorithm more computationally efficient than other barycentric-based distance methods, but it also has the advantage that gradients used for backpropagation can be efficiently computed during the forward pass itself, which leads to substantially faster model training. We provide technical details about this new regularization term and its properties, and we present experimental demonstrations of faster runtimes when compared to standard Wasserstein-style methods. Finally, on a range of experiments designed to assess effectiveness at enforcing fairness, we demonstrate that our method compares well with alternatives.
- Award ID(s): 1918211
- PAR ID: 10467900
- Publisher / Repository: OpenReview.net
- Date Published:
- Format(s): Medium: X
- Location: International Conference on Learning Representations
- Sponsoring Org: National Science Foundation
More Like this
Remote sensing observations from satellites and global biogeochemical models have combined to revolutionize the study of ocean biogeochemical cycling, but comparing the two data streams to each other and across time remains challenging due to the strong spatial-temporal structuring of the ocean. Here, we show that the Wasserstein distance provides a powerful metric for harnessing these structured datasets for better marine ecosystem and climate predictions. The Wasserstein distance complements commonly used point-wise difference methods such as the root-mean-squared error, by quantifying differences in terms of spatial displacement in addition to magnitude. As a test case, we consider chlorophyll (a key indicator of phytoplankton biomass) in the northeast Pacific Ocean, obtained from model simulations, in situ measurements, and satellite observations. We focus on two main applications: (i) comparing model predictions with satellite observations, and (ii) temporal evolution of chlorophyll both seasonally and over longer time frames. The Wasserstein distance successfully isolates temporal and depth variability and quantifies shifts in biogeochemical province boundaries. It also exposes relevant temporal trends in satellite chlorophyll consistent with climate change predictions. Our study shows that optimal transport vectors underlying the Wasserstein distance provide a novel visualization tool for testing models and better understanding temporal dynamics in the ocean.
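The contrast between a point-wise error and the Wasserstein distance can be sketched with a toy example in Python. This is an illustration only and does not use the study's data or code: it assumes one-dimensional histograms of equal total mass on a shared uniform grid, where the Wasserstein-1 distance equals the area between the two cumulative sums. The names `rmse` and `wasserstein_1d` and the grid values are hypothetical.

```python
import math
from itertools import accumulate


def rmse(p, q):
    """Point-wise root-mean-squared error between two gridded fields."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) / len(p))


def wasserstein_1d(p, q, dx=1.0):
    """Wasserstein-1 distance between two histograms on the same uniform grid.

    Assumes non-negative inputs with equal total mass. In 1-D the distance
    is the area between the two cumulative sums.
    """
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return dx * sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# A unit-mass patch at cell 2, and the same patch displaced by 1 or 5 cells.
base = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
near = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
far = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

print(rmse(base, near) == rmse(base, far))                    # True
print(wasserstein_1d(base, near), wasserstein_1d(base, far))  # 1.0 5.0
```

In this toy case the root-mean-squared error cannot tell a small displacement from a large one once the patches no longer overlap, while the Wasserstein distance grows with the displacement.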
The Wasserstein barycenter is a principled approach to represent the weighted mean of a given set of probability distributions, utilizing the geometry induced by optimal transport. In this work, we present a novel scalable algorithm to approximate Wasserstein barycenters, aimed at high-dimensional applications in machine learning. Our proposed algorithm is based on the Kantorovich dual formulation of the Wasserstein-2 distance as well as a recent neural network architecture, the input convex neural network, that is known to parametrize convex functions. The distinguishing features of our method are: i) it only requires samples from the marginal distributions; ii) unlike the existing approaches, it represents the barycenter with a generative model and can thus generate infinite samples from the barycenter without querying the marginal distributions; iii) it works similarly to a generative adversarial model in the one-marginal case. We demonstrate the efficacy of our algorithm by comparing it with the state-of-the-art methods in multiple experiments.
We consider synthesis and analysis of probability measures using the entropy-regularized Wasserstein-2 cost and its unbiased version, the Sinkhorn divergence. The synthesis problem consists of computing the barycenter, with respect to these costs, of m reference measures given a set of coefficients belonging to the m-dimensional simplex. The analysis problem consists of finding the coefficients for the closest barycenter in the Wasserstein-2 distance to a given measure μ. Under the weakest assumptions on the measures thus far in the literature, we compute the derivative of the entropy-regularized Wasserstein-2 cost. We leverage this to establish a characterization of regularized barycenters as solutions to a fixed-point equation for the average of the entropic maps from the barycenter to the reference measures. This characterization yields a finite-dimensional, convex, quadratic program for solving the analysis problem when μ is a barycenter. It is shown that these coordinates, as well as the value of the barycenter functional, can be estimated from samples with dimension-independent rates of convergence, a hallmark of entropy-regularized optimal transport, and we verify these rates experimentally. We also establish that barycentric coordinates are stable with respect to perturbations in the Wasserstein-2 metric, suggesting a robustness of these coefficients to corruptions. We employ the barycentric coefficients as features for classification of corrupted point cloud data, and show that compared to neural network baselines, our approach is more efficient in small training data regimes.
Quantifying the structure and dynamics of species interactions in ecological communities is fundamental to studying ecology and evolution. While there are numerous approaches to analysing ecological networks, there is not yet an approach that can (1) quantify dissimilarity in the global structure of ecological networks that range from identical species and interaction composition to zero shared species or interactions and (2) map species between such networks while incorporating additional ecological information, such as species traits or abundances. To address these challenges, we introduce the use of optimal transport distances to quantify ecological network dissimilarity and functionally equivalent species between networks. Specifically, we describe the Gromov–Wasserstein (GW) and Fused Gromov–Wasserstein (FGW) distances. We apply these optimal transport methods to synthetic and empirical data, using mammal food webs throughout sub‐Saharan Africa for illustration. We showcase the application of GW and FGW distances to identify the most functionally similar species between food webs, incorporate additional trait information into network comparisons and quantify food web dissimilarity among geographic regions. Our results demonstrate that GW and FGW distances can effectively differentiate ecological networks based on their topological structure while identifying functionally equivalent species, even when networks have different species. The FGW distance further improves node mapping for basal species by incorporating node‐level traits. We show that these methods allow for a more nuanced understanding of the topological similarities in food web networks among geographic regions compared to an alternative measure of network dissimilarity based on species identities. Optimal transport distances offer a new approach for quantifying functional equivalence between networks and a measure of network dissimilarity suitable for a broader range of uses than existing approaches. OT methods can be harnessed to analyse ecological networks at large spatial scales and compare networks among ecosystems, realms or taxa. Optimal transport‐based distances, therefore, provide a powerful tool for analysing ecological networks with great potential to advance our understanding of ecological community structure and dynamics in a changing world.