Title: Large-Scale Wasserstein Gradient Flows
Wasserstein gradient flows provide a powerful means of understanding and solving many diffusion equations. Specifically, Fokker-Planck equations, which model the diffusion of probability measures, can be understood as gradient descent over entropy functionals in Wasserstein space. This equivalence, introduced by Jordan, Kinderlehrer and Otto, inspired the so-called JKO scheme, which approximates these diffusion processes via an implicit discretization of the gradient flow in Wasserstein space. Solving the optimization problem associated with each JKO step, however, presents serious computational challenges. We introduce a scalable method to approximate Wasserstein gradient flows, targeted to machine learning applications. Our approach relies on input-convex neural networks (ICNNs) to discretize the JKO steps, which can be optimized by stochastic gradient descent. In contrast to previous work, our method does not require domain discretization or particle simulation. As a result, we can sample from the measure at each time step of the diffusion and compute its probability density. We demonstrate the performance of our algorithm by computing diffusions following the Fokker-Planck equation and apply it to unnormalized density sampling as well as nonlinear filtering.
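To make the JKO step mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the ICNN method of the paper: it assumes a one-dimensional Ornstein-Uhlenbeck setting with potential V(x) = x^2/2 and inverse temperature beta, and it restricts the minimization to Gaussian measures, so that each step rho_{k+1} = argmin_rho W_2^2(rho, rho_k)/(2h) + F(rho), with F(rho) = ∫ V rho + (1/beta) ∫ rho log rho, has a closed form. The function names, the step size h, and the Gaussian restriction are all assumptions made for illustration.

```python
import math


def jko_step_gaussian(mean, std, h, beta=1.0):
    """One JKO step restricted to 1D Gaussians (illustrative assumption).

    For N(mean, std^2) and a candidate N(m, s^2):
      W_2^2            = (m - mean)^2 + (s - std)^2
      integral of V    = (m^2 + s^2) / 2          with V(x) = x^2 / 2
      entropy term     = -(1 / beta) * log(s) + constant
    Setting the derivatives of the step objective to zero gives
      m = mean / (1 + h)
      (1 + h) * s^2 - std * s - h / beta = 0      (take the positive root)
    """
    new_mean = mean / (1.0 + h)
    disc = std ** 2 + 4.0 * h * (1.0 + h) / beta
    new_std = (std + math.sqrt(disc)) / (2.0 * (1.0 + h))
    return new_mean, new_std


def run_jko(mean, std, h=0.01, steps=2000, beta=1.0):
    """Iterate the closed-form step; returns the final (mean, std)."""
    for _ in range(steps):
        mean, std = jko_step_gaussian(mean, std, h, beta)
    return mean, std


if __name__ == "__main__":
    m, s = run_jko(mean=3.0, std=0.2)
    print(f"mean={m:.6f}, std={s:.6f}")
```

Under these assumptions the iterates approach the stationary measure N(0, 1/beta) of the corresponding Fokker-Planck equation, which gives a simple sanity check. In higher dimensions or for non-Gaussian measures no such closed form is available, which is the gap the ICNN parameterization described in the abstract is meant to address.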
Award ID(s):
1838071
NSF-PAR ID:
10310374
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
NeurIPS
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present a discretization-free scalable framework for solving a large class of mass-conserving partial differential equations (PDEs), including the time-dependent Fokker-Planck equation and the Wasserstein gradient flow. The main observation is that the time-varying velocity field of the PDE solution needs to be self-consistent: it must satisfy a fixed-point equation involving the probability flow characterized by the same velocity field. Instead of directly minimizing the residual of the fixed-point equation with neural parameterization, we use an iterative formulation with a biased gradient estimator that bypasses significant computational obstacles with strong empirical performance. Compared to existing approaches, our method does not suffer from temporal or spatial discretization, covers a wider range of PDEs, and scales to high dimensions. Experimentally, our method recovers analytical solutions accurately when they are available and achieves superior performance in high dimensions with less training time compared to alternatives. 
  2. We develop a new computational framework to solve the partial differential equations (PDEs) governing the flow of the joint probability density functions (PDFs) in continuous-time stochastic nonlinear systems. The need for computing the transient joint PDFs subject to prior dynamics arises in uncertainty propagation, nonlinear filtering and stochastic control. Our methodology breaks away from the traditional approaches of spatial discretization or function approximation, both of which, in general, suffer from the “curse-of-dimensionality”. In the proposed framework, we discretize time but not the state space. We solve infinite-dimensional proximal recursions in the manifold of joint PDFs, which, in the small time-step limit, is theoretically equivalent to solving the underlying transport PDEs. The resulting computation has the geometric interpretation of a gradient flow of a certain free energy functional with respect to the Wasserstein metric arising from the theory of optimal mass transport. We show that dualization along with an entropic regularization leads to a cone-preserving fixed point recursion that is proved to be contractive in the Thompson metric. A block co-ordinate iteration scheme is proposed to solve the resulting nonlinear recursions with guaranteed convergence. This approach enables remarkably fast computation for non-parametric transient joint PDF propagation. Numerical examples and various extensions are provided to illustrate the scope and efficacy of the proposed approach.
  3. This paper studies computational methods for quasi-stationary distributions (QSDs). We first propose a data-driven solver that solves Fokker–Planck equations for QSDs. Similar to the case of Fokker–Planck equations for invariant probability measures, we set up an optimization problem that minimizes the distance from a low-accuracy reference solution, under the constraint of satisfying the linear relation given by the discretized Fokker–Planck operator. We then use the coupling method to study the sensitivity of a QSD to changes in either the boundary condition or the diffusion coefficient. The 1-Wasserstein distance between a QSD and the corresponding invariant probability measure can be quantitatively estimated. Numerical results on both the computation of QSDs and their sensitivity analysis are provided.
  4. An approximate analytical solution is derived for a certain class of stochastic differential equations with constant diffusion but nonlinear drift coefficients. Specifically, a closed form expression is derived for the response process transition probability density function (PDF) based on the concept of the Wiener path integral and on a Cauchy–Schwarz inequality treatment. This is done in conjunction with formulating and solving an error minimisation problem by relying on the associated Fokker–Planck equation operator. The developed technique, which requires minimal computational cost for the determination of the response process PDF, exhibits satisfactory accuracy and is capable of capturing the salient features of the PDF, as demonstrated by comparisons with pertinent Monte Carlo simulation data. In addition to the mathematical merit of the approximate analytical solution, the derived PDF can also be used as a benchmark for assessing the accuracy of alternative, more computationally demanding, numerical solution techniques. Several examples are provided for assessing the reliability of the proposed approximation.
  5. In this work we introduce semi-implicit or implicit finite difference schemes for the continuity equation with a gradient flow structure. Examples of such equations include the linear Fokker–Planck equation and the Keller–Segel equations. The two proposed schemes are first-order accurate in time, explicitly solvable, and second-order and fourth-order accurate in space, respectively; they are obtained via a finite difference implementation of the classical continuous finite element method. The fully discrete schemes are proved to be positivity preserving and energy dissipative: the second-order scheme is so unconditionally, while the fourth-order scheme only requires a mild time step and mesh size constraint. In particular, the fourth-order scheme is the first high-order spatial discretization that achieves both positivity and energy decay properties, making it suitable for long-time simulation and for obtaining accurate steady-state solutions.