NSF PAR Search | NSF Public Access Repository

Generalization in diffusion models arises from geometry-adaptive harmonic representations

Kadkhodaie, Zahra; Guth, Florentin; Simoncelli, Eero; Mallat, Stéphane (May 2024, International Conference on Learning Representations 2024)

Deep neural networks (DNNs) trained for image denoising are able to generate high-quality samples with score-based reverse diffusion algorithms. These impressive capabilities seem to imply an escape from the curse of dimensionality, but recent reports of memorization of the training set raise the question of whether these networks are learning the "true" continuous density of the data. Here, we show that two DNNs trained on non-overlapping subsets of a dataset learn nearly the same score function, and thus the same density, when the number of training images is large enough. In this regime of strong generalization, diffusion-generated images are distinct from the training set, and are of high visual quality, suggesting that the inductive biases of the DNNs are well-aligned with the data density. We analyze the learned denoising functions and show that the inductive biases give rise to a shrinkage operation in a basis adapted to the underlying image. Examination of these bases reveals oscillating harmonic structures along contours and in homogeneous regions. We demonstrate that trained denoisers are inductively biased towards these geometry-adaptive harmonic bases since they arise not only when the network is trained on photographic images, but also when it is trained on image classes supported on low-dimensional manifolds for which the harmonic basis is suboptimal. Finally, we show that when trained on regular image classes for which the optimal basis is known to be geometry-adaptive and harmonic, the denoising performance of the networks is near-optimal.
Full Text Available
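The "shrinkage operation in a basis" mentioned in the first abstract above can be illustrated with a toy sketch. This is not the authors' code: the basis here is a fixed random orthonormal matrix and the shrinkage factors are made up, whereas the abstract describes a basis that adapts to each image. The names `shrink_in_basis`, `basis`, and `factors` are hypothetical.

```python
import numpy as np

def shrink_in_basis(y, basis, factors):
    """Project y onto an orthonormal basis, scale each coefficient, map back.

    basis   : (n, n) array whose columns are orthonormal (assumed).
    factors : (n,) array of shrinkage factors in [0, 1] (assumed).
    """
    coeffs = basis.T @ y               # analysis: coefficients of y in the basis
    return basis @ (factors * coeffs)  # synthesis after per-coefficient shrinkage

rng = np.random.default_rng(0)
n = 8
# Toy stand-in for an image-adapted basis: a random orthonormal matrix.
basis, _ = np.linalg.qr(rng.standard_normal((n, n)))
# Keep the first components, suppress the last ones (illustrative values).
factors = np.linspace(1.0, 0.0, n)

y = rng.standard_normal(n)             # toy "noisy signal"
x_hat = shrink_in_basis(y, basis, factors)

# The operator is linear here, so its Jacobian is basis @ diag(factors) @ basis.T:
# symmetric, with eigenvalues equal to the shrinkage factors.
jacobian = basis @ np.diag(factors) @ basis.T
```

In this linear toy case the eigenvectors of `jacobian` are the basis vectors and its eigenvalues are the shrinkage factors, so the output norm can never exceed the input norm.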
Learning multi-scale local conditional probability models of images

Kadkhodaie, Zahra; Guth, Florentin; Mallat, Stéphane; Simoncelli, Eero P (March 2023, ICLR 2023)

Deep neural networks can learn powerful prior probability models for images, as evidenced by the high-quality generations obtained with recent score-based diffusion methods. But the means by which these networks capture complex global statistical structure, apparently without suffering from the curse of dimensionality, remain a mystery. To study this, we incorporate diffusion methods into a multi-scale decomposition, reducing dimensionality by assuming a stationary local Markov model for wavelet coefficients conditioned on coarser-scale coefficients. We instantiate this model using convolutional neural networks (CNNs) with local receptive fields, which enforce both the stationarity and Markov properties. Global structures are captured using a CNN with receptive fields covering the entire (but small) low-pass image. We test this model on a dataset of face images, which are highly non-stationary and contain large-scale geometric structures. Remarkably, denoising, super-resolution, and image synthesis results all demonstrate that these structures can be captured with significantly smaller conditioning neighborhoods than required by a Markov model implemented in the pixel domain. Our results show that score estimation for large complex images can be reduced to low-dimensional Markov conditional models across scales, alleviating the curse of dimensionality.
Full Text Available
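The multi-scale decomposition described in the abstract above can be sketched as follows. The abstract only says "wavelet coefficients"; the Haar wavelet used here is an assumption chosen for brevity, and `haar_level`, `inverse_haar_level`, and `pyramid` are hypothetical helper names. The sketch shows how an image splits into a small low-pass image plus detail bands at each scale, which is the structure the conditional models operate on.

```python
import numpy as np

def haar_level(img):
    """One level of an orthonormal 2-D Haar transform (even-sized input assumed)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    low = (a + b + c + d) / 2          # coarse (low-pass) image, half resolution
    dx = (a - b + c - d) / 2           # differences across columns
    dy = (a + b - c - d) / 2           # differences across rows
    dxy = (a - b - c + d) / 2          # diagonal differences
    return low, (dx, dy, dxy)

def inverse_haar_level(low, details):
    """Exactly invert haar_level."""
    dx, dy, dxy = details
    out = np.empty((2 * low.shape[0], 2 * low.shape[1]))
    out[0::2, 0::2] = (low + dx + dy + dxy) / 2
    out[0::2, 1::2] = (low - dx + dy - dxy) / 2
    out[1::2, 0::2] = (low + dx - dy - dxy) / 2
    out[1::2, 1::2] = (low - dx - dy + dxy) / 2
    return out

def pyramid(img, levels):
    """Return the coarsest low-pass image and detail bands, coarsest first."""
    bands = []
    low = img
    for _ in range(levels):
        low, details = haar_level(low)
        bands.append(details)
    return low, bands[::-1]

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))  # toy stand-in for an image
low, bands = pyramid(image, levels=2)

# Coarse-to-fine reconstruction: each step adds the details of the next scale.
recon = low
for details in bands:
    recon = inverse_haar_level(recon, details)
```

A generative model in this setting would sample the small low-pass image first and then sample each set of detail bands conditioned on the coarser image; the loop above only demonstrates the deterministic reconstruction path.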
Predicting synthesizability of crystalline materials via deep learning

https://doi.org/10.1038/s43246-021-00219-x

Davariashtiyani, Ali; Kadkhodaie, Zahra; Kadkhodaei, Sara (December 2021, Communications Materials)

Predicting the synthesizability of hypothetical crystals is challenging because of the wide range of parameters that govern materials synthesis. Yet, exploring the exponentially large space of novel crystals for any future application demands an accurate predictive capability for synthesis likelihood to avoid a haphazard trial-and-error. Typically, benchmarks of synthesizability are defined based on the energy of crystal structures. Here, we take an alternative approach to select features of synthesizability from the latent information embedded in crystalline materials. We represent the atomic structure of crystalline materials by three-dimensional pixel-wise images that are color-coded by their chemical attributes. The image representation of crystals enables the use of a convolutional encoder to learn the features of synthesizability hidden in structural and chemical arrangements of crystalline materials. Based on the presented model, we can accurately classify materials into synthesizable crystals versus crystal anomalies across a broad range of crystal structure types and chemical compositions. We illustrate the usefulness of the model by predicting the synthesizability of hypothetical crystals for battery electrode and thermoelectric applications.
Full Text Available
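The three-dimensional image representation described in the abstract above might look roughly like the sketch below. Everything specific here is an assumption for illustration: the grid size, the use of fractional coordinates, the choice of atomic number as the single "color" attribute, and the function name `voxelize`. The paper's actual encoding may differ.

```python
import numpy as np

def voxelize(frac_coords, attributes, grid=16):
    """Place one chemical attribute per atom into a grid x grid x grid image.

    frac_coords : (n_atoms, 3) fractional coordinates within the unit cell.
    attributes  : (n_atoms,) one number per atom (atomic number in this toy).
    """
    frac = np.asarray(frac_coords, dtype=float) % 1.0   # wrap into [0, 1)
    idx = np.floor(frac * grid).astype(int)
    idx = np.minimum(idx, grid - 1)                     # guard against rounding
    image = np.zeros((grid, grid, grid), dtype=float)
    for (i, j, k), value in zip(idx, attributes):
        image[i, j, k] = value                          # "color" the voxel
    return image

# Hypothetical two-atom cell: positions and atomic numbers are illustrative.
coords = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
atomic_numbers = [11, 17]
image = voxelize(coords, atomic_numbers, grid=16)
```

An array like `image` is the kind of input a 3-D convolutional encoder could consume; a fuller encoding would likely use several attribute channels rather than one.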
Stochastic Solutions for Linear Inverse Problems using the Prior Implicit in a Denoiser

Kadkhodaie, Zahra; Simoncelli, Eero (January 2021, Advances in neural information processing systems)

Deep neural networks have provided state-of-the-art solutions for problems such as image denoising, which implicitly rely on a prior probability model of natural images. Two recent lines of work – Denoising Score Matching and Plug-and-Play – propose methodologies for drawing samples from this implicit prior and using it to solve inverse problems, respectively. Here, we develop a parsimonious and robust generalization of these ideas. We rely on a classic statistical result that shows the least-squares solution for removing additive Gaussian noise can be written directly in terms of the gradient of the log of the noisy signal density. We use this to derive a stochastic coarse-to-fine gradient ascent procedure for drawing high-probability samples from the implicit prior embedded within a CNN trained to perform blind denoising. A generalization of this algorithm to constrained sampling provides a method for using the implicit prior to solve any deterministic linear inverse problem, with no additional training, thus extending the power of supervised learning for denoising to a much broader set of problems. The algorithm relies on minimal assumptions and exhibits robust convergence over a wide range of parameter choices. To demonstrate the generality of our method, we use it to obtain state-of-the-art levels of unsupervised performance for deblurring, super-resolution, and compressive sensing.
Full Text Available
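The "classic statistical result" referred to in the abstract above states that, for y = x + Gaussian noise of variance σ², the least-squares estimate is x̂(y) = y + σ² ∇ log p(y), where p is the density of the noisy signal. The sketch below checks this identity numerically in a case where everything is known in closed form. The scalar Gaussian prior and the names `mmse_denoise_from_score` and `gaussian_score` are assumptions for illustration; the paper works with a CNN denoiser, not an analytic score.

```python
import numpy as np

def mmse_denoise_from_score(y, score, sigma):
    """Least-squares denoiser written in terms of the score of the noisy density."""
    return y + sigma**2 * score(y)

# Assumed toy prior: x ~ N(mu, s2). Then y = x + noise is N(mu, s2 + sigma^2).
mu, s2, sigma = 1.0, 4.0, 0.5

def gaussian_score(y):
    """Gradient of log p(y) for the noisy density N(mu, s2 + sigma^2)."""
    return -(y - mu) / (s2 + sigma**2)

y = np.array([-2.0, 0.0, 3.0])
x_hat = mmse_denoise_from_score(y, gaussian_score, sigma)

# Reference: the textbook posterior mean for a Gaussian prior and Gaussian noise.
x_ref = mu + s2 / (s2 + sigma**2) * (y - mu)
```

Read in the other direction, the same identity says that the residual of a least-squares denoiser, x̂(y) − y, is proportional to the score, which is what allows a trained denoiser to stand in for the gradient of the log density during sampling.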