We propose a family of First Hitting Diffusion Models (FHDM), deep generative models that generate data with a diffusion process that terminates at a random first hitting time. This extends standard fixed-time diffusion models, which terminate at a pre-specified deterministic time. Although standard diffusion models are designed for continuous unconstrained data, FHDM is naturally designed to learn distributions on continuous domains as well as a range of discrete and structured domains. Moreover, FHDM enables instance-dependent termination times and accelerates the diffusion process, sampling higher-quality data with fewer diffusion steps. Technically, we train FHDM by maximum likelihood estimation on diffusion trajectories augmented from observed data with conditional first hitting processes (i.e., bridges) derived from Doob’s h-transform, deviating from the commonly used time-reversal mechanism. We apply FHDM to generate data in various domains, such as point clouds (general continuous distributions), climate and geographical events on Earth (continuous distributions on the sphere), unweighted graphs (distributions of binary matrices), and segmentation maps of 2D images (high-dimensional categorical distributions). We observe considerable improvement over state-of-the-art approaches in both quality and speed.
LEARNING DIFFUSION BRIDGES ON CONSTRAINED DOMAINS
Diffusion models have recently achieved promising results in generative learning. However, because diffusion processes are most naturally applied on the unconstrained Euclidean space R^d, key challenges arise in developing diffusion-based models for learning data on constrained and structured domains. We present a simple and unified framework for this purpose that can be easily adapted to various types of domains, including product spaces of any type (be it bounded/unbounded, continuous/discrete, categorical/ordinal, or a mix of these). In our model, the diffusion process is driven by a drift force that is the sum of two terms: a singular force, designed by Doob’s h-transform, that ensures all outcomes of the process belong to the desired domain, and a non-singular neural force field that is trained so that the outcomes follow the data distribution statistically. Experiments show that our methods perform superbly on generating tabular data, images, semantic segments and 3D point clouds. Code is available at https://github.com/gnobitab/ConstrainedDiffusionBridge.
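The two-term drift described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a one-dimensional Brownian motion and a finite target domain {0, 1}, and it uses the standard h-transform with h(z, t) equal to the sum of Gaussian transition densities to the domain points, which yields a softmax-weighted mixture of Brownian-bridge drifts. The names `omega_bridge_drift`, `neural_drift` and `sample` are hypothetical, and the neural force field is replaced by a zero placeholder.

```python
import numpy as np

def omega_bridge_drift(z, t, T, omega):
    """Singular drift (assumed form): grad log h for h(z, t) = sum_k N(a_k; z, T - t).

    It equals a softmax-weighted average of Brownian-bridge drifts
    (a_k - z) / (T - t) and blows up as t -> T, forcing z into omega.
    """
    remaining = T - t
    diffs = omega[None, :] - z[:, None]            # shape (n, K)
    logits = -diffs**2 / (2.0 * remaining)
    logits -= logits.max(axis=1, keepdims=True)    # numerically stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights * diffs).sum(axis=1) / remaining

def neural_drift(z, t):
    """Placeholder for the trained, non-singular force field (here: zero)."""
    return np.zeros_like(z)

def sample(n=512, steps=1000, T=1.0, omega=(0.0, 1.0), seed=0):
    """Euler-Maruyama simulation of dz = (singular + neural drift) dt + dW."""
    omega = np.asarray(omega, dtype=float)
    rng = np.random.default_rng(seed)
    dt = T / steps
    z = rng.normal(0.5, 1.0, size=n)               # arbitrary initial distribution
    for i in range(steps):
        t = i * dt
        drift = omega_bridge_drift(z, t, T, omega) + neural_drift(z, t)
        z = z + drift * dt + np.sqrt(dt) * rng.normal(size=n)
    return z

if __name__ == "__main__":
    samples = sample()
    nearest = np.abs(samples[:, None] - np.array([0.0, 1.0])).min(axis=1)
    print("largest distance to the domain {0, 1}:", nearest.max())
```

With the zero placeholder, the samples land on the domain (up to discretization noise of order sqrt(dt)) but with proportions determined only by the initial distribution. In the framework described above, the trained neural term is what shifts those outcomes toward the data distribution.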
- Award ID(s):
- 1846421
- NSF-PAR ID:
- 10440560
- Date Published:
- Journal Name:
- International Conference on Learning Representations (ICLR)
- ISSN:
- 1049-5258
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Effectively modeling and predicting information cascades is at the core of understanding information diffusion, which is essential for many downstream applications, such as fake news detection and viral marketing identification. Conventional methods for cascade prediction depend heavily on the assumptions of diffusion models and on hand-crafted features. Owing to the significant recent successes of deep learning in multiple domains, attempts have been made to predict cascades with neural-network-based approaches. However, existing models cannot capture both the underlying structure of a cascade graph and the node sequence in the diffusion process, which in turn results in unsatisfactory prediction performance. In this paper, we propose a deep multi-task learning framework with a novel shared-representation layer to aid in explicitly understanding and predicting cascades. As it turns out, the latent representation learned by the shared-representation layer encodes the structure and the node sequence of the cascade very well. Our experiments on real-world datasets demonstrate that our method can significantly improve prediction accuracy and reduce computational cost compared to state-of-the-art baselines.
-
Oh, Alice; Naumann, Tristan; Globerson, Amir; Saenko, Kate; Hardt, Moritz; Levine, Sergey (Eds.) Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models. Our code is available at https://github.com/justinlovelace/latent-diffusion-for-language.
-
Saddle point search schemes are widely used to identify the transition state of different processes, such as chemical reactions, surface and bulk diffusion, surface adsorption, and many more. In solid-state materials with relatively large numbers of atoms, minimum-mode-following schemes such as the dimer method are commonly used because they avoid calculating the Hessian on the high-dimensional potential energy surface. Here, we show that the dimer search can be further accelerated by leveraging Gaussian process regression (GPR). The GPR serves as a surrogate model that supplies the dimer with the required energy and force input. We test the GPR-accelerated dimer method for predicting the diffusion coefficient of vacancy-mediated self-diffusion in body-centered cubic molybdenum and of sulfur diffusion in hexagonal molybdenum disulfide. We use a multitask learning approach that utilizes a shared covariance function between energy and force input, and we show that multitask learning significantly improves the performance of the GPR surrogate model compared to previously used learning approaches. Additionally, we demonstrate that a translation-hop sampling approach is necessary to avoid overfitting the GPR surrogate model to the minimum-mode-following pathway and thus to succeed in locating the saddle point. We show that our method reduces the number of evaluations compared to a conventional dimer method.
-
Deep Learning (DL) has significantly changed a large number of research areas in recent decades. For example, researchers have built several impressive Convolutional Neural Network (CNN) models that successfully address image classification on large-scale visual datasets. Transfer Learning (TL) uses those pre-trained models to ease the feature learning process for other target domains that contain less training data. Currently, there are numerous ways to utilize features generated by transfer learning. Pre-trained CNN models provide mid- and high-level features that can serve different target problem domains. In this paper, a DL feature and model selection framework based on evolutionary programming is proposed to address the challenges in visual data classification. It automates the process of discovering and obtaining the most representative features generated by the pre-trained DL models for different classification tasks.