We present a discretization-free scalable framework for solving a large class of mass-conserving partial differential equations (PDEs), including the time-dependent Fokker–Planck equation and the Wasserstein gradient flow. The main observation is that the time-varying velocity field of the PDE solution needs to be self-consistent: it must satisfy a fixed-point equation involving the probability flow characterized by the same velocity field. Instead of directly minimizing the residual of the fixed-point equation with neural parameterization, we use an iterative formulation with a biased gradient estimator that bypasses significant computational obstacles with strong empirical performance. Compared to existing approaches, our method does not suffer from temporal or spatial discretization, covers a wider range of PDEs, and scales to high dimensions. Experimentally, our method recovers analytical solutions accurately when they are available and achieves superior performance in high dimensions with less training time compared to alternatives.
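The self-consistency condition can be checked by hand in a case where the solution is known. The sketch below is not the neural method of the abstract. It assumes a one-dimensional Fokker–Planck equation with potential V(x) = x^2 / 2, unit diffusion, and a Gaussian initial density, for which the solution stays Gaussian and the velocity field v = -grad V - grad log rho has a closed form. All function names are hypothetical. The code integrates the probability-flow ODE dx/dt = v(x, t) and compares the transported points with the affine map that carries the initial Gaussian to the exact solution.

```python
import math

def ou_moments(t, m0, s0):
    """Closed-form mean and standard deviation of the Gaussian solution at time t
    (assumes V(x) = x^2 / 2, unit diffusion, initial density N(m0, s0^2))."""
    mean = m0 * math.exp(-t)
    var = 1.0 + (s0 ** 2 - 1.0) * math.exp(-2.0 * t)
    return mean, math.sqrt(var)

def velocity(x, t, m0, s0):
    """Velocity field v = -grad V - grad log rho evaluated on the exact solution."""
    mean, std = ou_moments(t, m0, s0)
    return -x + (x - mean) / std ** 2

def flow(x0, t_end, m0, s0, n_steps=200):
    """Integrate the probability-flow ODE dx/dt = v(x, t) with classical RK4."""
    h = t_end / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        k1 = velocity(x, t, m0, s0)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h, m0, s0)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h, m0, s0)
        k4 = velocity(x + h * k3, t + h, m0, s0)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x

def transported_exact(x0, t, m0, s0):
    """Affine map sending N(m0, s0^2) to the exact Gaussian solution at time t."""
    mean, std = ou_moments(t, m0, s0)
    return mean + std * (x0 - m0) / s0

if __name__ == "__main__":
    for x0 in (-1.0, 0.5, 3.0):
        gap = abs(flow(x0, 1.0, 2.0, 0.5) - transported_exact(x0, 1.0, 2.0, 0.5))
        print(f"x0 = {x0:+.1f}  |numerical - exact| = {gap:.2e}")
```

In this example the flow generated by the velocity field reproduces the density that defines the velocity field, which is the fixed-point property in its simplest form. In the general setting neither the density nor the velocity is known in advance, which is where the iterative formulation of the abstract comes in.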
Large-Scale Wasserstein Gradient Flows
Wasserstein gradient flows provide a powerful means of understanding and solving many diffusion equations. Specifically, Fokker–Planck equations, which model the diffusion of probability measures, can be understood as gradient descent over entropy functionals in Wasserstein space. This equivalence, introduced by Jordan, Kinderlehrer and Otto, inspired the so-called JKO scheme to approximate these diffusion processes via an implicit discretization of the gradient flow in Wasserstein space. Solving the optimization problem associated with each JKO step, however, presents serious computational challenges. We introduce a scalable method to approximate Wasserstein gradient flows, targeted to machine learning applications. Our approach relies on input-convex neural networks (ICNNs) to discretize the JKO steps, which can be optimized by stochastic gradient descent. In contrast to previous work, our method does not require domain discretization or particle simulation. As a result, we can sample from the measure at each time step of the diffusion and compute its probability density. We demonstrate the performance of our algorithm by computing diffusions following the Fokker–Planck equation and apply it to unnormalized density sampling as well as nonlinear filtering.
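For reference, the JKO step described above is usually written as follows. This is the standard textbook form rather than a formula quoted from the paper; the potential $V$, inverse temperature $\beta$, step size $\tau$, and convex potential $\psi$ are notation assumed here. The last two lines indicate one way a convex parameterization can enter: by Brenier's theorem, if $\rho_k$ is absolutely continuous and $\rho$ is the pushforward of $\rho_k$ under the gradient of a convex function $\psi$, then $\nabla\psi$ is the optimal transport map between them.

```latex
\begin{align*}
% Fokker-Planck equation (assumed notation: potential V, inverse temperature beta)
\partial_t \rho &= \nabla \cdot (\rho \nabla V) + \beta^{-1} \Delta \rho \\
% Free energy whose Wasserstein gradient flow is the equation above
\mathcal{F}(\rho) &= \int V(x)\, \rho(x)\, dx + \beta^{-1} \int \rho(x) \log \rho(x)\, dx \\
% One JKO step of size tau
\rho_{k+1} &= \operatorname*{arg\,min}_{\rho} \; \mathcal{F}(\rho) + \frac{1}{2\tau} W_2^2(\rho, \rho_k) \\
% Pushforward by the gradient of a convex psi (Brenier)
\rho = (\nabla \psi)_{\#} \rho_k \;&\Longrightarrow\;
W_2^2(\rho, \rho_k) = \int \lVert x - \nabla \psi(x) \rVert^2 \, d\rho_k(x) \\
% The step rewritten as a problem over convex functions
\psi_{k} &= \operatorname*{arg\,min}_{\psi \text{ convex}} \;
\mathcal{F}\big((\nabla \psi)_{\#} \rho_k\big) + \frac{1}{2\tau} \int \lVert x - \nabla \psi(x) \rVert^2 \, d\rho_k(x)
\end{align*}
```

Under this reading, each step is an optimization over convex functions, which is the kind of problem an ICNN can represent and stochastic gradient descent can attack, and samples of $\rho_{k+1}$ are obtained by pushing samples of $\rho_k$ through $\nabla \psi_k$.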
Award ID(s): 1838071
NSF-PAR ID: 10310374
Date Published:
Journal Name: NeurIPS
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this


We develop a new computational framework to solve the partial differential equations (PDEs) governing the flow of the joint probability density functions (PDFs) in continuous-time stochastic nonlinear systems. The need for computing the transient joint PDFs subject to prior dynamics arises in uncertainty propagation, nonlinear filtering and stochastic control. Our methodology breaks away from the traditional approach of spatial discretization or function approximation – both of which, in general, suffer from the “curse of dimensionality”. In the proposed framework, we discretize time but not the state space. We solve infinite-dimensional proximal recursions in the manifold of joint PDFs, which, in the small time-step limit, is theoretically equivalent to solving the underlying transport PDEs. The resulting computation has the geometric interpretation of gradient flow of a certain free energy functional with respect to the Wasserstein metric arising from the theory of optimal mass transport. We show that dualization along with an entropic regularization leads to a cone-preserving fixed-point recursion that is proved to be contractive in the Thompson metric. A block coordinate iteration scheme is proposed to solve the resulting nonlinear recursions with guaranteed convergence. This approach enables remarkably fast computation for nonparametric transient joint PDF propagation. Numerical examples and various extensions are provided to illustrate the scope and efficacy of the proposed approach.

This paper studies computational methods for quasi-stationary distributions (QSDs). We first propose a data-driven solver that solves Fokker–Planck equations for QSDs. Similar to the case of Fokker–Planck equations for invariant probability measures, we set up an optimization problem that minimizes the distance from a low-accuracy reference solution, under the constraint of satisfying the linear relation given by the discretized Fokker–Planck operator. Then we use the coupling method to study the sensitivity of a QSD to changes in either the boundary condition or the diffusion coefficient. The 1-Wasserstein distance between a QSD and the corresponding invariant probability measure can be quantitatively estimated. Some numerical results on both the computation of QSDs and their sensitivity analysis are provided.

An approximate analytical solution is derived for a certain class of stochastic differential equations with constant diffusion, but nonlinear drift coefficients. Specifically, a closed-form expression is derived for the response process transition probability density function (PDF) based on the concept of the Wiener path integral and on a Cauchy–Schwarz inequality treatment. This is done in conjunction with formulating and solving an error minimisation problem by relying on the associated Fokker–Planck equation operator. The developed technique, which requires minimal computational cost for the determination of the response process PDF, exhibits satisfactory accuracy and is capable of capturing the salient features of the PDF as demonstrated by comparisons with pertinent Monte Carlo simulation data. In addition to the mathematical merit of the approximate analytical solution, the derived PDF can also be used as a benchmark for assessing the accuracy of alternative, more computationally demanding, numerical solution techniques. Several examples are provided for assessing the reliability of the proposed approximation.

In this work we introduce semi-implicit or implicit finite difference schemes for the continuity equation with a gradient flow structure. Examples of such equations include the linear Fokker–Planck equation and the Keller–Segel equations. The two proposed schemes are first-order accurate in time, explicitly solvable, and second-order and fourth-order accurate in space, which are obtained via finite difference implementation of the classical continuous finite element method. The fully discrete schemes are proved to be positivity preserving and energy dissipative: the second-order scheme achieves this unconditionally, while the fourth-order scheme only requires a mild time step and mesh size constraint. In particular, the fourth-order scheme is the first high-order spatial discretization that can achieve both positivity and energy decay properties, which makes it suitable for long-time simulation and for obtaining accurate steady-state solutions.
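To make the three properties named above (mass conservation, positivity, energy decay) concrete, here is a minimal sketch for the linear Fokker–Planck equation in one dimension, written in the gradient-flow form d_t rho = d_x( M d_x(rho / M) ) with M = exp(-V). It is not one of the schemes from the abstract: it is a simpler explicit, second-order-in-space conservative difference scheme, so it keeps positivity only under a time-step restriction of order dx^2, whereas the abstract reports unconditional positivity for its second-order scheme. The potential V(x) = x^2 / 2, the geometric-mean interface weights, the no-flux boundaries, and all function names are assumptions made for illustration.

```python
import math

def fokker_planck_step(rho, maxwellian, dx, dt):
    """One explicit conservative step of d_t rho = d_x( M d_x(rho / M) )."""
    n = len(rho)
    g = [rho[i] / maxwellian[i] for i in range(n)]
    # flux[i] is the flux through the interface between cells i-1 and i.
    # flux[0] and flux[n] stay zero, which imposes no-flux boundaries.
    flux = [0.0] * (n + 1)
    for i in range(1, n):
        # Geometric mean of M at the interface (one possible choice).
        m_half = math.sqrt(maxwellian[i - 1] * maxwellian[i])
        flux[i] = -m_half * (g[i] - g[i - 1]) / dx
    return [rho[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]

def free_energy(rho, maxwellian, dx):
    """Discrete relative entropy sum(rho * log(rho / M)) * dx."""
    return sum(r * math.log(r / m) for r, m in zip(rho, maxwellian)) * dx

def run_demo(n=81, half_width=4.0, dt=0.002, n_steps=500):
    """Evolve an off-centre Gaussian bump toward the equilibrium rho ~ M."""
    dx = 2.0 * half_width / (n - 1)
    xs = [-half_width + i * dx for i in range(n)]
    maxwellian = [math.exp(-0.5 * x * x) for x in xs]      # M = exp(-V)
    rho = [math.exp(-2.0 * (x - 1.0) ** 2) for x in xs]    # initial bump
    mass0 = sum(rho) * dx
    rho = [r / mass0 for r in rho]                          # unit discrete mass
    energy_start = free_energy(rho, maxwellian, dx)
    for _ in range(n_steps):
        rho = fokker_planck_step(rho, maxwellian, dx, dt)
    energy_end = free_energy(rho, maxwellian, dx)
    return rho, energy_start, energy_end, sum(rho) * dx

if __name__ == "__main__":
    rho, e0, e1, mass = run_demo()
    print("mass:", mass)
    print("min density:", min(rho))
    print("free energy start/end:", e0, e1)
```

Mass is conserved because the interface fluxes telescope when summed over cells. Positivity holds here because, with the default dx = 0.1 and dt = 0.002, each updated value is a combination of neighbouring values with non-negative weights. A larger time step can break that property, which is the limitation that implicit or semi-implicit time stepping is meant to remove.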