

This content will become publicly available on January 21, 2026

Title: AdjointDEIS: Efficient Gradients for Diffusion Models
The optimization of the latents and parameters of diffusion models with respect to some differentiable metric defined on the output of the model is a challenging and complex problem. Sampling from diffusion models is performed by solving either the probability flow ODE or the diffusion SDE, wherein a neural network approximates the score function, allowing a numerical ODE/SDE solver to be used. However, naïve backpropagation techniques are memory intensive, requiring the storage of all intermediate states, and face additional complexity in handling the injected noise from the diffusion term of the diffusion SDE. We propose a novel family of bespoke ODE solvers for the continuous adjoint equations of diffusion models, which we call AdjointDEIS. We exploit the unique construction of diffusion SDEs to further simplify the formulation of the continuous adjoint equations using exponential integrators. Moreover, we provide convergence order guarantees for our bespoke solvers. Significantly, we show that the continuous adjoint equations for diffusion SDEs actually simplify to a simple ODE. Lastly, we demonstrate the effectiveness of AdjointDEIS for guided generation with an adversarial attack in the form of the face morphing problem. Our code will be released at https://github.com/zblasingame/AdjointDEIS.
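As context for the memory claim above, the continuous adjoint approach replaces backpropagation through the solver with a second, backwards-in-time ODE solve. Below is a minimal sketch of the standard continuous adjoint equations for a generic ODE; the notation is illustrative and not taken from the paper.

```latex
% Continuous adjoint equations (neural-ODE form) for dx_t/dt = f_theta(x_t, t).
% Given a loss L on the terminal state, the adjoint a_x(t) = dL/dx_t and the
% parameter gradient are obtained by integrating backwards in time:
\begin{align*}
  \frac{\mathrm{d}\mathbf{a}_x(t)}{\mathrm{d}t}
      &= -\mathbf{a}_x(t)^\top \frac{\partial f_\theta(\mathbf{x}_t, t)}{\partial \mathbf{x}_t}, \\
  \frac{\mathrm{d}\mathcal{L}}{\mathrm{d}\theta}
      &= \int_{t_0}^{t_1} \mathbf{a}_x(t)^\top
         \frac{\partial f_\theta(\mathbf{x}_t, t)}{\partial \theta}\,\mathrm{d}t,
\end{align*}
% where [t_0, t_1] is the interval of the forward (sampling) solve.
```

Because only the current state and adjoint need to be stored, memory is constant in the number of solver steps; the paper's contribution is a family of exponential-integrator solvers tailored to these equations for the probability flow ODE and the diffusion SDE.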
Award ID(s):
2413228
PAR ID:
10608842
Author(s) / Creator(s):
Blasingame, Zander W.; Liu, Chen
Publisher / Repository:
Advances in Neural Information Processing Systems
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Although the governing equations of many systems, when derived from first principles, may be viewed as known, it is often too expensive to numerically simulate all the interactions they describe. Therefore, researchers often seek simpler descriptions that describe complex phenomena without numerically resolving all the interacting components. Stochastic differential equations (SDEs) arise naturally as models in this context. The growth in data acquisition, both through experiment and through simulations, provides an opportunity for the systematic derivation of SDE models in many disciplines. However, inconsistencies between SDEs and real data at short time scales often cause problems, when standard statistical methodology is applied to parameter estimation. The incompatibility between SDEs and real data can be addressed by deriving sufficient statistics from the time-series data and learning parameters of SDEs based on these. Here, we study sufficient statistics computed from time averages, an approach that we demonstrate to lead to sufficient statistics on a variety of problems and that has the secondary benefit of obviating the need to match trajectories. Following this approach, we formulate the fitting of SDEs to sufficient statistics from real data as an inverse problem and demonstrate that this inverse problem can be solved by using ensemble Kalman inversion. Furthermore, we create a framework for non-parametric learning of drift and diffusion terms by introducing hierarchical, refinable parameterizations of unknown functions, using Gaussian process regression. We demonstrate the proposed methodology for the fitting of SDE models, first in a simulation study with a noisy Lorenz ’63 model, and then in other applications, including dimension reduction in deterministic chaotic systems arising in the atmospheric sciences, large-scale pattern modeling in climate dynamics and simplified models for key observables arising in molecular dynamics. The results confirm that the proposed methodology provides a robust and systematic approach to fitting SDE models to real data. 
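As a concrete illustration of the inversion step described above, here is a minimal ensemble Kalman inversion update for fitting SDE parameters to time-averaged sufficient statistics. The forward map `G`, the observed statistics `y`, and the noise covariance `Gamma` are placeholders for whatever a given application supplies; this is a sketch of the generic EKI step, not the authors' implementation.

```python
import numpy as np

def eki_update(thetas, G, y, Gamma):
    """One derivative-free ensemble Kalman inversion step.

    thetas : (J, p) ensemble of candidate SDE parameter vectors
    G      : map from a parameter vector to simulated sufficient
             statistics (e.g. time averages along an SDE trajectory)
    y      : (d,) sufficient statistics computed from real data
    Gamma  : (d, d) observation noise covariance
    """
    Gs = np.stack([G(theta) for theta in thetas])   # (J, d) predictions
    dT = thetas - thetas.mean(axis=0)               # centred parameters
    dG = Gs - Gs.mean(axis=0)                       # centred predictions
    C_tg = dT.T @ dG / len(thetas)                  # (p, d) cross-covariance
    C_gg = dG.T @ dG / len(thetas)                  # (d, d) output covariance
    K = C_tg @ np.linalg.inv(C_gg + Gamma)          # (p, d) Kalman-style gain
    return thetas + (y - Gs) @ K.T                  # nudge ensemble toward data
```

Iterating this update drives the ensemble toward parameters whose simulated time averages match the observed ones, without derivatives of the forward map and without matching individual trajectories.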
  2. We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference. HBNODEs have two properties that imply practical advantages over NODEs: (i) The adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers, thus significantly reducing the number of function evaluations (NFEs) and improving the utility of the trained models. (ii) The spectrum of HBNODEs is well structured, enabling effective learning of long-term dependencies from complex sequential data. We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling. Our method requires remarkably fewer forward and backward NFEs, is more accurate, and learns long-term dependencies more effectively than the other ODE-based neural network models. Code is available at https://github.com/hedixia/HeavyBallNODE. 
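For reference, the heavy ball continuous limit the abstract alludes to can be sketched as follows; the notation is illustrative rather than the authors' own.

```latex
% Heavy ball ODE: the continuous limit of momentum gradient descent,
%   \ddot{h}(t) + \gamma \dot{h}(t) = f_\theta(h(t), t),
% rewritten as the first-order system that is actually integrated:
\begin{align*}
  \frac{\mathrm{d}\mathbf{h}(t)}{\mathrm{d}t} &= \mathbf{m}(t), \\
  \frac{\mathrm{d}\mathbf{m}(t)}{\mathrm{d}t} &= -\gamma\,\mathbf{m}(t)
      + f_\theta(\mathbf{h}(t), t),
\end{align*}
% where gamma > 0 is a (possibly learnable) damping parameter.
```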
  3. We study the ergodic properties of a class of controlled stochastic differential equations (SDEs) driven by α-stable processes which arise as the limiting equations of multiclass queueing models in the Halfin–Whitt regime that have heavy-tailed arrival processes. When the safety staffing parameter is positive, we show that the SDEs are uniformly ergodic and enjoy a polynomial rate of convergence to the invariant probability measure in total variation, which is uniform over all stationary Markov controls resulting in a locally Lipschitz continuous drift. We also derive a matching lower bound on the rate of convergence (under no abandonment). On the other hand, when all abandonment rates are positive, we show that the SDEs are exponentially ergodic uniformly over the above-mentioned class of controls. Analogous results are obtained for Lévy-driven SDEs arising from multiclass many-server queues under asymptotically negligible service interruptions. For these equations, we show that the aforementioned ergodic properties are uniform over all stationary Markov controls. We also extend a key functional central limit theorem concerning diffusion approximations so as to make it applicable to the models studied here.
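Schematically, the limiting equations in question are controlled SDEs of the form below; the symbols are illustrative and do not follow the paper's exact notation.

```latex
% Controlled SDE driven by an alpha-stable Lévy process A_t:
\[
  \mathrm{d}X_t = b\bigl(X_t, U_t\bigr)\,\mathrm{d}t + \mathrm{d}A_t,
  \qquad X_0 = x,
\]
% where U_t is the (stationary Markov) control and the drift b encodes the
% service, abandonment, and safety-staffing structure of the queueing model.
```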
  4. We consider linear and nonlinear transport equations with irregular velocity fields, motivated by models coming from mean field games. The velocity fields are assumed to increase in each coordinate, and the divergence therefore fails to be absolutely continuous with respect to the Lebesgue measure in general. For such velocity fields, the well-posedness of first- and second-order linear transport equations in Lebesgue spaces is established, as well as the existence and uniqueness of regular ODE and SDE Lagrangian flows. These results are then applied to the study of certain nonconservative, nonlinear systems of transport type, which are used to model mean field games in a finite state space. A notion of weak solution is identified for which unique minimal and maximal solutions exist, which do not coincide in general. A selection-by-noise result is established for a relevant example to demonstrate that different types of noise can select any of the admissible solutions in the vanishing noise limit. 
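Concretely, the first-order linear problem referred to above is the transport equation below, sketched in generic notation.

```latex
% Linear transport equation with a velocity field b(t, x) that is
% nondecreasing in each coordinate, so that div b need not be absolutely
% continuous with respect to Lebesgue measure:
\[
  \partial_t u(t, x) + b(t, x)\cdot\nabla u(t, x) = 0,
  \qquad u(0, \cdot) = u_0 \in L^p(\mathbb{R}^d),
\]
% with the associated ODE Lagrangian flow \dot{X}_t = b(t, X_t), X_0 = x.
```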
  5. Diffusion models (DMs) create samples from a data distribution by starting from random noise and iteratively solving a reverse-time ordinary differential equation (ODE). Because each step in the iterative solution requires an expensive neural function evaluation (NFE), there has been significant interest in approximately solving these diffusion ODEs with only a few NFEs without modifying the underlying model. However, in the few NFE regime, we observe that tracking the true ODE evolution is fundamentally impossible using traditional ODE solvers. In this work, we propose a new method that learns a good solver for the DM, which we call Solving for the Solver (S4S). S4S directly optimizes a solver to obtain good generation quality by learning to match the output of a strong teacher solver. We evaluate S4S on six different pre-trained DMs, including pixel-space and latent-space DMs for both conditional and unconditional sampling. In all settings, S4S uniformly improves the sample quality relative to traditional ODE solvers. Moreover, our method is lightweight, data-free, and can be plugged in black-box on top of any discretization schedule or architecture to improve performance. Building on top of this, we also propose S4S-Alt, which optimizes both the solver and the discretization schedule. By exploiting the full design space of DM solvers, with 5 NFEs, we achieve an FID of 3.73 on CIFAR10 and 13.26 on MS-COCO, representing a 1.5× improvement over previous training-free ODE methods. 
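To make the solver-learning idea concrete, here is a toy sketch of distilling learnable solver coefficients against a stronger teacher. The toy model, the Euler teacher, and the multistep parameterization are assumptions for illustration, not the S4S implementation.

```python
import torch

# Stand-ins (assumptions, not the authors' code): a toy drift in place of the
# pretrained DM, and a fine-grained Euler solve in place of a strong teacher.
eps_model = lambda x, t: -x
ts = torch.linspace(1.0, 0.0, 6)                 # 5 student steps

def teacher_solve(x, steps=200):
    """Many-NFE reference trajectory the student learns to match."""
    tt = torch.linspace(1.0, 0.0, steps + 1)
    for i in range(steps):
        x = x + (tt[i + 1] - tt[i]) * eps_model(x, tt[i])
    return x

def student_solve(x, coeffs):
    """Few-step solver whose per-step combination weights are learnable."""
    history = []
    for i in range(len(ts) - 1):
        history.append(eps_model(x, ts[i]))
        k = min(len(history), coeffs.shape[1])
        # Learnable linear combination of the last k model evaluations,
        # in the spirit of a multistep update rule.
        drift = sum(coeffs[i, j] * history[-1 - j] for j in range(k))
        x = x + (ts[i + 1] - ts[i]) * drift
    return x

coeffs = torch.zeros(5, 3, requires_grad=True)
coeffs.data[:, 0] = 1.0                          # initialize to plain Euler
opt = torch.optim.Adam([coeffs], lr=1e-2)
for _ in range(500):
    x_T = torch.randn(64, 2)                     # shared initial noise
    loss = (student_solve(x_T, coeffs) - teacher_solve(x_T)).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

The same matching objective, applied to a pretrained DM with a strong teacher solver, is the essence of the approach described above; S4S-Alt additionally makes the discretization schedule (here the fixed `ts`) learnable.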