skip to main content

This content will become publicly available on October 1, 2024

Title: Learning Energy-Based Models in High-Dimensional Spaces with Multiscale Denoising-Score Matching

Energy-based models (EBMs) assign an unnormalized log probability to data samples. This functionality has a variety of applications, such as sample synthesis, data denoising, sample restoration, outlier detection, Bayesian reasoning and many more. But, the training of EBMs using standard maximum likelihood is extremely slow because it requires sampling from the model distribution. Score matching potentially alleviates this problem. In particular, denoising-score matching has been successfully used to train EBMs. Using noisy data samples with one fixed noise level, these models learn fast and yield good results in data denoising. However, demonstrations of such models in the high-quality sample synthesis of high-dimensional data were lacking. Recently, a paper showed that a generative model trained by denoising-score matching accomplishes excellent sample synthesis when trained with data samples corrupted with multiple levels of noise. Here we provide an analysis and empirical evidence showing that training with multiple noise levels is necessary when the data dimension is high. Leveraging this insight, we propose a novel EBM trained with multiscale denoising-score matching. Our model exhibits a data-generation performance comparable to state-of-the-art techniques such as GANs and sets a new baseline for EBMs. The proposed model also provides density information and performs well on an image-inpainting task.

more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Publisher / Repository:
Date Published:
Journal Name:
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. While energy-based models (EBMs) exhibit a number of desirable properties, training and sampling on high-dimensional datasets remains challenging. Inspired by recent progress on diffusion probabilistic models, we present a diffusion re- covery likelihood method to tractably learn and sample from a sequence of EBMs trained on increasingly noisy versions of a dataset. Each EBM is trained with recovery likelihood, which maximizes the conditional probability of the data at a certain noise level given their noisy versions at a higher noise level. Optimizing re- covery likelihood is more tractable than marginal likelihood, as sampling from the conditional distributions is much easier than sampling from the marginal distribu- tions. After training, synthesized images can be generated by the sampling process that initializes from Gaussian white noise distribution and progressively samples the conditional distributions at decreasingly lower noise levels. Our method gener- ates high fidelity samples on various image datasets. On unconditional CIFAR-10 our method achieves FID 9.58 and inception score 8.30, superior to the majority of GANs. Moreover, we demonstrate that unlike previous work on EBMs, our long-run MCMC samples from the conditional distributions do not diverge and still represent realistic images, allowing us to accurately estimate the normalized density of data even for high-dimensional datasets. Our implementation is avail- able at 
    more » « less
  2. Deep neural networks have provided state-of-the-art solutions for problems such as image denoising, which implicitly rely on a prior probability model of natural images. Two recent lines of work – Denoising Score Matching and Plug-and-Play – propose methodologies for drawing samples from this implicit prior and using it to solve inverse problems, respectively. Here, we develop a parsimonious and robust generalization of these ideas. We rely on a classic statistical result that shows the least-squares solution for removing additive Gaussian noise can be written directly in terms of the gradient of the log of the noisy signal density. We use this to derive a stochastic coarse-to-fine gradient ascent procedure for drawing high-probability samples from the implicit prior embedded within a CNN trained to perform blind denoising. A generalization of this algorithm to constrained sampling provides a method for using the implicit prior to solve any deterministic linear inverse problem, with no additional training, thus extending the power of supervised learning for denoising to a much broader set of problems. The algorithm relies on minimal assumptions and exhibits robust convergence over a wide range of parameter choices. To demonstrate the generality of our method, we use it to obtain state-of-the-art levels of unsupervised performance for deblurring, super-resolution, and compressive sensing. 
    more » « less
  3. Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples. Recent works, such as Flow Matching, derived paths that are optimal for each data sample. However, these algorithms rely on independent data and noise samples, and do not exploit underlying structure in the data distribution for constructing probability paths. We propose Multisample Flow Matching, a more general framework that uses non-trivial couplings between data and noise samples while satisfying the correct marginal constraints. At very small overhead costs, this generalization allows us to (i) reduce gradient variance during training, (ii) obtain straighter flows for the learned vector field, which allows us to generate high-quality samples using fewer function evaluations, and (iii) obtain transport maps with lower cost in high dimensions, which has applications beyond generative modeling. Importantly, we do so in a completely simulation-free manner with a simple minimization objective. We show that our proposed methods improve sample consistency on downsampled ImageNet data sets, and lead to better low-cost sample generation. 
    more » « less
  4. Yap, Pew-Thian (Ed.)
    Diffusion weighted imaging (DWI) with multiple, high b-values is critical for extracting tissue microstructure measurements; however, high b-value DWI images contain high noise levels that can overwhelm the signal of interest and bias microstructural measurements. Here, we propose a simple denoising method that can be applied to any dataset, provided a low-noise, single-subject dataset is acquired using the same DWI sequence. The denoising method uses a one-dimensional convolutional neural network (1D-CNN) and deep learning to learn from a low-noise dataset, voxel-by-voxel. The trained model can then be applied to high-noise datasets from other subjects. We validated the 1D-CNN denoising method by first demonstrating that 1D-CNN denoising resulted in DWI images that were more similar to the noise-free ground truth than comparable denoising methods, e.g., MP-PCA, using simulated DWI data. Using the same DWI acquisition but reconstructed with two common reconstruction methods, i.e. SENSE1 and sum-of-square, to generate a pair of low-noise and high-noise datasets, we then demonstrated that 1D-CNN denoising of high-noise DWI data collected from human subjects showed promising results in three domains: DWI images, diffusion metrics, and tractography. In particular, the denoised images were very similar to a low-noise reference image of that subject, more than the similarity between repeated low-noise images (i.e. computational reproducibility). Finally, we demonstrated the use of the 1D-CNN method in two practical examples to reduce noise from parallel imaging and simultaneous multi-slice acquisition. We conclude that the 1D-CNN denoising method is a simple, effective denoising method for DWI images that overcomes some of the limitations of current state-of-the-art denoising methods, such as the need for a large number of training subjects and the need to account for the rectified noise floor. 
    more » « less
  5. null (Ed.)
    Deep convolutional neural networks (CNNs) for image denoising are usually trained on large datasets. These models achieve the current state of the art, but they have difficulties generalizing when applied to data that deviate from the training distribution. Recent work has shown that it is possible to train denoisers on a single noisy image. These models adapt to the features of the test image, but their performance is limited by the small amount of information used to train them. Here we propose "GainTuning", in which CNN models pre-trained on large datasets are adaptively and selectively adjusted for individual test images. To avoid overfitting, GainTuning optimizes a single multiplicative scaling parameter (the "Gain") of each channel in the convolutional layers of the CNN. We show that GainTuning improves state-of-the-art CNNs on standard image-denoising benchmarks, boosting their denoising performance on nearly every image in a held-out test set. These adaptive improvements are even more substantial for test images differing systematically from the training data, either in noise level or image type. We illustrate the potential of adaptive denoising in a scientific application, in which a CNN is trained on synthetic data, and tested on real transmission-electron-microscope images. In contrast to the existing methodology, GainTuning is able to faithfully reconstruct the structure of catalytic nanoparticles from these data at extremely low signal-to-noise ratios. 
    more » « less