Title: OT-Flow: Fast and Accurate Continuous Normalizing Flows via Optimal Transport
A normalizing flow is an invertible mapping between an arbitrary probability distribution and a standard normal distribution; it can be used for density estimation and statistical inference. Computing the flow follows the change of variables formula and thus requires invertibility of the mapping and an efficient way to compute the determinant of its Jacobian. To satisfy these requirements, normalizing flows typically consist of carefully chosen components. Continuous normalizing flows (CNFs) are mappings obtained by solving a neural ordinary differential equation (ODE). The neural ODE's dynamics can be chosen almost arbitrarily while ensuring invertibility. Moreover, the log-determinant of the flow's Jacobian can be obtained by integrating the trace of the dynamics' Jacobian along the flow. Our proposed OT-Flow approach tackles two critical computational challenges that limit a more widespread use of CNFs. First, OT-Flow leverages optimal transport (OT) theory to regularize the CNF and enforce straight trajectories that are easier to integrate. Second, OT-Flow features exact trace computation with time complexity equal to that of the trace estimators used in existing CNFs. On five high-dimensional density estimation and generative modeling tasks, OT-Flow performs competitively with state-of-the-art CNFs while on average requiring one-fourth of the number of weights, with an 8x speedup in training time and a 24x speedup in inference.
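The trace identity mentioned in the abstract can be checked numerically on a toy problem. The sketch below is only an illustration and is not the OT-Flow method: it assumes linear dynamics dz/dt = A z (a stand-in for a neural network), for which the Jacobian of the dynamics is the constant matrix A. The function name `flow_jacobian_logdet` and all parameter values are hypothetical.

```python
import numpy as np

def flow_jacobian_logdet(A, T=1.0, steps=200):
    """Compare log|det| of the flow map's Jacobian with the integrated trace.

    Assumes linear dynamics dz/dt = A z, so the Jacobian Phi(t) of the
    flow map satisfies dPhi/dt = A Phi with Phi(0) = I.
    """
    d = A.shape[0]
    h = T / steps
    Phi = np.eye(d)
    trace_integral = 0.0
    for _ in range(steps):
        # Classical RK4 step for the matrix ODE dPhi/dt = A Phi.
        k1 = A @ Phi
        k2 = A @ (Phi + 0.5 * h * k1)
        k3 = A @ (Phi + 0.5 * h * k2)
        k4 = A @ (Phi + h * k3)
        Phi = Phi + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        # The dynamics' Jacobian is A everywhere, so its trace is constant.
        trace_integral += h * np.trace(A)
    _, logdet = np.linalg.slogdet(Phi)
    return logdet, trace_integral

rng = np.random.default_rng(0)
A = 0.5 * rng.standard_normal((3, 3))  # arbitrary illustrative dynamics
logdet, trace_integral = flow_jacobian_logdet(A)
print(logdet, trace_integral)
```

Under these assumptions the two printed numbers should agree up to the integrator's discretization error. With a nonlinear network the trace varies along each trajectory, so it has to be accumulated while the ODE is being solved.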
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Sponsoring Org:
National Science Foundation
More Like this
  1. Triangular flows, also known as Knöthe-Rosenblatt measure couplings, comprise an important building block of normalizing flow models for generative modeling and density estimation, including popular autoregressive flows such as real-valued non-volume preserving transformation models (Real NVP). We present statistical guarantees and sample complexity bounds for triangular flow statistical models. In particular, we establish the statistical consistency and the finite sample convergence rates of the minimum Kullback-Leibler divergence statistical estimator of the Knöthe-Rosenblatt measure coupling using tools from empirical process theory. Our results highlight the anisotropic geometry of function classes at play in triangular flows, shed light on optimal coordinate ordering, and lead to statistical guarantees for Jacobian flows. We conduct numerical experiments to illustrate the practical implications of our theoretical findings.
  2. Larochelle, Hugo ; Ranzato, Marc'Aurelio ; Hadsell, Raia ; Balcan, Maria-Florina ; Lin, Hsuan-Tien (Ed.)
    To better conform to data geometry, recent deep generative modelling techniques adapt Euclidean constructions to non-Euclidean spaces. In this paper, we study normalizing flows on manifolds. Previous work has developed flow models for specific cases; however, these advancements handcraft layers on a manifold-by-manifold basis, restricting generality and inducing cumbersome design constraints. We overcome these issues by introducing Neural Manifold Ordinary Differential Equations, a manifold generalization of Neural ODEs, which enables the construction of Manifold Continuous Normalizing Flows (MCNFs). MCNFs require only local geometry (therefore generalizing to arbitrary manifolds) and compute probabilities with continuous change of variables (allowing for a simple and expressive flow construction). We find that leveraging continuous manifold dynamics produces a marked improvement for both density estimation and downstream tasks.
  3. We introduce deep learning models to estimate the masses of the binary components of black hole mergers, (m_1, m_2), and three astrophysical properties of the post-merger compact remnant, namely, the final spin, a_f, and the frequency and damping time of the ringdown oscillations of the fundamental ℓ = m = 2 bar mode, (ω_R, ω_I). Our neural networks combine a modified WaveNet architecture with contrastive learning and normalizing flow. We validate these models against a Gaussian conjugate prior family whose posterior distribution is described by a closed analytical expression. Upon confirming that our models produce statistically consistent results, we used them to estimate the astrophysical parameters (m_1, m_2, a_f, ω_R, ω_I) of five binary black holes: GW150914, GW170104, GW170814, GW190521 and GW190630. We use PyCBC Inference to directly compare traditional Bayesian methodologies for parameter estimation with our deep learning based posterior distributions. Our results show that our neural network models predict posterior distributions that encode physical correlations, and that our data-driven median results and 90% confidence intervals are similar to those produced with gravitational wave Bayesian analyses. This methodology requires a single V100 NVIDIA GPU to produce median values and posterior distributions within two milliseconds for each event. This neural network, and a tutorial for its use, are available at the Data and Learning Hub for Science.
  4. We investigate the feasibility of in-laboratory tomographic X-ray particle tracking velocimetry (TXPTV) and consider creeping flows with nearly density matched flow tracers. Specifically, in these proof-of-concept experiments we examined a Poiseuille flow, flow through porous media and a multiphase flow with a Taylor bubble. For a full 360° computed tomography (CT) scan we show that the specially selected 60 micron tracer particles could be imaged in less than 3 seconds with a signal-to-noise ratio between the tracers and the fluid of 2.5, sufficient to achieve proper volumetric segmentation at each time step. In the pipe flow, continuous Lagrangian particle trajectories were obtained, after which all the standard techniques used for PTV or PIV (taken at visible wavelengths) could also be employed for TXPTV data. Moreover, with TXPTV we can examine flows inaccessible with visible wavelengths due to opaque media or numerous refractive interfaces. In the case of opaque porous media we were able to observe material accumulation and pore clogging, and for the flow with a Taylor bubble we can trace the particles and hence obtain velocities in the liquid film between the wall and bubble, with the thickness of the liquid film itself also simultaneously obtained from the volumetric reconstruction after segmentation. While improvements in scan speed are anticipated due to continuing improvements in CT system components, we show that for the flows examined even the presently available CT systems could yield quantitative flow data, with the primary limitation being the quality of available flow tracers.
  5. Fast inference of numerical model parameters from data is an important prerequisite to generate predictive models for a wide range of applications. Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive. New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space, and rely on gradient-based optimization instead of sampling, providing a more efficient approach for Bayesian inference about the model parameters. Moreover, the cost of frequently evaluating an expensive likelihood can be mitigated by replacing the true model with an offline trained surrogate model, such as a neural network. However, this approach might generate significant bias when the surrogate is insufficiently accurate around the posterior modes. To reduce the computational cost without sacrificing inferential accuracy, we propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternately updates the normalizing flow parameters and surrogate model parameters. We also propose an efficient sample weighting scheme for surrogate model training that preserves global accuracy while effectively capturing high posterior density regions. We demonstrate the inferential and computational superiority of NoFAS against various benchmarks, including cases where the underlying model lacks identifiability. The source code and numerical experiments used for this study are available at