
Algorithms for mean-field variational inference via polyhedral optimization in the Wasserstein space

We develop a theory of finite-dimensional polyhedral subsets over the Wasserstein space and optimization of functionals over them via first-order methods. Our main application is to the problem of mean-field variational inference, which seeks to approximate a distribution π over ℝ^d by a product measure π⋆. When π is strongly log-concave and log-smooth, we provide (1) approximation rates certifying that π⋆ is close to the minimizer π⋆⋄ of the KL divergence over a polyhedral set ⋄, and (2) an algorithm for minimizing KL(⋅‖π) over ⋄ with accelerated complexity O(√κ log(κd/ε²)), where κ is the condition number of π.

Free, publicly accessible full text available June 30, 2025.
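As a toy illustration of the mean-field objective described above, the following sketch (our own construction, not the paper's polyhedral algorithm) evaluates KL(q‖π) in closed form when π is a correlated Gaussian and q ranges over product (diagonal-covariance) Gaussians. For a Gaussian target, the mean-field optimum is known to take the reciprocals of the diagonal of the precision matrix Σ⁻¹ as its variances; the sketch checks this against the naive choice of copying the marginal variances.

```python
import numpy as np

# Target pi = N(0, Sigma) with correlation; candidates q = N(0, D), D diagonal.
# Closed-form KL for zero-mean Gaussians:
#   KL(q || pi) = 0.5 * ( tr(Sigma^{-1} D) - d + log det Sigma - log det D )
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
Lam = np.linalg.inv(Sigma)          # precision matrix
d = Sigma.shape[0]

def kl_product_to_target(D):
    """KL( N(0, diag(D)) || N(0, Sigma) ) for a 1-D array of variances D."""
    return 0.5 * (np.trace(Lam @ np.diag(D)) - d
                  + np.log(np.linalg.det(Sigma)) - np.sum(np.log(D)))

# Mean-field optimum for a Gaussian target: variances are the reciprocals of
# the diagonal of the precision matrix (here, 0.75 in each coordinate).
D_opt = 1.0 / np.diag(Lam)
D_naive = np.diag(Sigma)            # naively copy the marginal variances

kl_opt = kl_product_to_target(D_opt)
kl_naive = kl_product_to_target(D_naive)
```

The gap between `kl_opt` and `kl_naive` is exactly the price of the off-diagonal correlation that no product measure can represent.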

Krause, Andreas; Brunskill, Emma; Cho, Kyunghyun; Engelhardt, Barbara; Sabato, Sivan; Scarlett, Jonathan (Eds.)

We consider the problem of estimating the optimal transport map between two probability distributions, P and Q in R^d, on the basis of i.i.d. samples. All existing statistical analyses of this problem require the assumption that the transport map is Lipschitz, a strong requirement that, in particular, excludes any examples where the transport map is discontinuous. As a first step towards developing estimation procedures for discontinuous maps, we consider the important special case where the data distribution Q is a discrete measure supported on a finite number of points in R^d. We study a computationally efficient estimator initially proposed by Pooladian and Niles-Weed (2021), based on entropic optimal transport, and show in the semidiscrete setting that it converges at the minimax-optimal rate n^{−1/2}, independent of dimension. Other standard map estimation techniques both lack finite-sample guarantees in this setting and provably suffer from the curse of dimensionality. We confirm these results in numerical experiments, and provide experiments for other settings, not covered by our theory, which indicate that the entropic estimator is a promising methodology for other discontinuous transport map estimation problems.
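A minimal NumPy sketch of the semidiscrete setting above: a basic Sinkhorn loop computes the entropic plan between empirical samples from a continuous P and a two-point discrete Q, and the map estimate is the conditional barycentric projection. The toy data, regularization level, and iteration count are our assumptions; this illustrates the general recipe (entropic plan, then conditional expectation of the target support) rather than reproducing the paper's estimator or experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 1))         # i.i.d. samples from a continuous P
Y = np.array([[-1.0], [1.0]])       # discrete support of Q
a = np.full(n, 1.0 / n)             # empirical weights for P
b = np.array([0.5, 0.5])            # weights for Q

eps = 0.1                           # entropic regularization strength
C = 0.5 * ((X - Y.T) ** 2)          # squared-distance cost, shape (n, 2)
K = np.exp(-C / eps)

# Sinkhorn iterations for the entropic optimal transport plan.
u = np.ones(n)
v = np.ones(2)
for _ in range(500):
    u = a / (K @ v)
    v = b / (K.T @ u)
plan = u[:, None] * K * v[None, :]

# Entropic map estimate via barycentric projection:
#   T_hat(x_i) = sum_j y_j * P(y_j | x_i) under the entropic plan.
cond = plan / plan.sum(axis=1, keepdims=True)
T_hat = cond @ Y
```

Here the true transport map sends x to sign(x) and is discontinuous at 0; the entropic estimate is a smooth, monotone surrogate that sharpens as eps shrinks.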


Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples. Recent works, such as Flow Matching, derived paths that are optimal for each data sample. However, these algorithms rely on independent data and noise samples, and do not exploit underlying structure in the data distribution for constructing probability paths. We propose Multisample Flow Matching, a more general framework that uses non-trivial couplings between data and noise samples while satisfying the correct marginal constraints. At very small overhead costs, this generalization allows us to (i) reduce gradient variance during training, (ii) obtain straighter flows for the learned vector field, which allows us to generate high-quality samples using fewer function evaluations, and (iii) obtain transport maps with lower cost in high dimensions, which has applications beyond generative modeling. Importantly, we do so in a completely simulation-free manner with a simple minimization objective. We show that our proposed methods improve sample consistency on downsampled ImageNet data sets, and lead to better low-cost sample generation.
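One concrete example of a non-trivial coupling in this spirit is minibatch optimal transport: instead of pairing noise and data samples independently, re-pair them within each batch by solving an assignment problem, which preserves both batch marginals while shortening (and hence straightening) the conditional paths. A hedged sketch under our own toy setup, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
B, d = 64, 2
x0 = rng.normal(size=(B, d))            # noise samples
x1 = rng.normal(size=(B, d)) + 3.0      # toy "data" samples

# Independent coupling would pair x0[i] with x1[i]. A multisample coupling
# re-pairs within the batch to minimize total squared transport cost.
cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(axis=-1)
row, col = linear_sum_assignment(cost)  # row is 0..B-1 for a square cost matrix
x1_matched = x1[col]

# Conditional flow-matching targets along straight-line paths:
t = rng.uniform(size=(B, 1))
x_t = (1.0 - t) * x0 + t * x1_matched   # interpolant at time t
v_target = x1_matched - x0              # regression target for the vector field
```

Because `col` is a permutation, each batch still has the correct noise and data marginals; only the joint pairing changes, which is exactly the degree of freedom the abstract's "non-trivial couplings" refers to.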