Title: Analytical Probability Distributions and EM-Learning for Deep Generative Networks
Deep Generative Networks (DGNs) with probabilistic modeling of their output and latent space are currently trained via Variational Autoencoders (VAEs). In the absence of a known analytical form for the posterior and likelihood expectation, VAEs resort to approximations, including (Amortized) Variational Inference (AVI) and Monte-Carlo (MC) sampling. We exploit the Continuous Piecewise Affine (CPA) property of modern DGNs to derive their posterior and marginal distributions as well as the latter's first moments. These findings enable us to derive an analytical Expectation-Maximization (EM) algorithm for gradient-free DGN learning. We demonstrate empirically that EM training of DGNs produces greater likelihood than VAE training. Our findings will guide the design of new VAE AVI schemes that better approximate the true posterior and open avenues to apply standard statistical tools for model comparison, anomaly detection, and missing data imputation.
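To make the CPA property concrete, here is a minimal numpy sketch (not the authors' released code) of the idea: on the latent region sharing one ReLU activation pattern, the generator reduces to an affine map g(z) = Az + b, and standard linear-Gaussian conditioning then yields the Gaussian parameters of the posterior restricted to that region. The toy network sizes and the helper names (affine_params_at, region_posterior) are illustrative assumptions; the exact posterior is a mixture of such Gaussians truncated to their regions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy ReLU generator (untrained placeholder weights):
    # latent dim 2 -> hidden 8 -> output 5.
    W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)
    W2, b2 = rng.normal(size=(5, 8)), rng.normal(size=5)

    def affine_params_at(z):
        """(A, b) of the affine map the generator computes on the CPA
        region containing z, read off from the ReLU activation pattern."""
        D = np.diag((W1 @ z + b1 > 0).astype(float))  # frozen activation pattern
        return W2 @ D @ W1, W2 @ D @ b1 + b2

    def region_posterior(x, z_anchor, s2=0.1):
        """With z ~ N(0, I) and x | z ~ N(g(z), s2*I), the posterior on the
        region of z_anchor is N(mu, Sigma) truncated to that region, with
        Sigma = (A^T A / s2 + I)^{-1} and mu = Sigma A^T (x - b) / s2."""
        A, b = affine_params_at(z_anchor)
        Sigma = np.linalg.inv(A.T @ A / s2 + np.eye(A.shape[1]))
        return Sigma @ A.T @ (x - b) / s2, Sigma

    z0 = rng.normal(size=2)
    x = W2 @ np.maximum(W1 @ z0 + b1, 0.0) + b2   # an output g(z0)
    mu, Sigma = region_posterior(x, z0)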
Award ID(s):
1911094 1838177 1730574
NSF-PAR ID:
10198633
Author(s) / Creator(s):
Date Published:
Journal Name:
Advances in Neural Information Processing Systems
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Joan Bruna, Jan S (Ed.)
    In recent years, the field of machine learning has made phenomenal progress in the pursuit of simulating real-world data generation processes. One notable example of such success is the variational autoencoder (VAE). In this work, with a small shift in perspective, we leverage and adapt VAEs for a different purpose: uncertainty quantification in scientific inverse problems. We introduce UQ-VAE: a flexible, adaptive, hybrid data/model-constrained framework for training neural networks capable of rapid modelling of the posterior distribution representing the unknown parameter of interest. Specifically, from divergence-based variational inference, our framework is derived such that most of the information usually present in scientific inverse problems is fully utilized in the training procedure. Additionally, this framework includes an adjustable hyperparameter that allows selection of the notion of distance between the posterior model and the target distribution. This introduces more flexibility in controlling how optimization directs the learning of the posterior model. Further, this framework possesses an inherent adaptive optimization property that emerges through the learning of the posterior uncertainty. Numerical results for an elliptic PDE-constrained Bayesian inverse problem are provided to verify the proposed framework. 
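    A hypothetical PyTorch sketch of a loss in the spirit of the framework above: an encoder maps an observation to a Gaussian posterior model over the unknown parameter, a known forward operator supplies the model-informed data misfit, and a weight alpha stands in for the adjustable hyperparameter. The names (uq_vae_style_loss, forward_op) and the specific KL penalty are assumptions, not the paper's exact divergence family.

    import torch
    import torch.nn as nn

    # Known, fixed forward operator F: parameter u in R^16 -> observation in R^32.
    forward_op = nn.Linear(16, 32, bias=False)
    for p in forward_op.parameters():
        p.requires_grad_(False)

    # Posterior model: encoder outputs mean and log-std of N(mu, diag(sigma^2)).
    encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2 * 16))

    def uq_vae_style_loss(y, alpha=0.5, prior_std=1.0):
        mu, log_sigma = encoder(y).chunk(2, dim=-1)
        sigma = log_sigma.exp()
        u = mu + sigma * torch.randn_like(sigma)        # reparameterized draw
        # (i) model-informed data misfit through the known forward operator.
        misfit = ((forward_op(u) - y) ** 2).sum(-1).mean()
        # (ii) KL(N(mu, sigma^2) || N(0, prior_std^2)), summed over dimensions.
        kl = (torch.log(prior_std / sigma)
              + (sigma ** 2 + mu ** 2) / (2 * prior_std ** 2) - 0.5).sum(-1).mean()
        return alpha * misfit + (1 - alpha) * kl        # alpha sets the balance

    loss = uq_vae_style_loss(torch.randn(8, 32))
    loss.backward()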
  2. Generative models, such as Variational Autoencoders (VAEs), are increasingly employed for atypical pattern detection in brain imaging. During training, these models learn to capture the underlying patterns within “normal” brain images and generate new samples from those patterns. Neurodivergent states can be observed by measuring the dissimilarity between the generated/reconstructed images and the input images. This paper leverages VAEs to conduct Functional Connectivity (FC) analysis from functional Magnetic Resonance Imaging (fMRI) scans of individuals with Autism Spectrum Disorder (ASD), aiming to uncover atypical interconnectivity between brain regions. In the first part of our study, we compare multiple VAE architectures (a Conditional VAE, a Recurrent VAE, and a hybrid of a CNN in parallel with an RNN VAE) to establish the effectiveness of VAEs when applied to FC analysis. Given the nature of the disorder, ASD exhibits a higher prevalence among males than females. Therefore, in the second part of this paper, we investigate whether introducing phenotypic data could improve the performance of VAEs and, consequently, FC analysis. We compare our results with the findings from previous studies in the literature. The results showed that the CNN-based VAE architecture is more effective for this application than the other models.
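    A hypothetical sketch of the scoring scheme described above: a conditional VAE is trained on vectorized functional-connectivity matrices, with a binary phenotypic covariate (sex) concatenated to both encoder and decoder inputs, and atypicality is scored by reconstruction dissimilarity. The region count, layer sizes, and the class name ConditionalVAE are placeholders, not the authors' architectures.

    import torch
    import torch.nn as nn

    N_ROI = 90                       # number of brain regions (placeholder)
    D = N_ROI * (N_ROI - 1) // 2     # upper triangle of an FC matrix

    class ConditionalVAE(nn.Module):
        def __init__(self, latent=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(D + 1, 256), nn.ReLU(),
                                     nn.Linear(256, 2 * latent))
            self.dec = nn.Sequential(nn.Linear(latent + 1, 256), nn.ReLU(),
                                     nn.Linear(256, D))

        def forward(self, fc, sex):
            # Condition both encoder and decoder on the phenotypic covariate.
            mu, log_var = self.enc(torch.cat([fc, sex], -1)).chunk(2, -1)
            z = mu + (0.5 * log_var).exp() * torch.randn_like(mu)
            return self.dec(torch.cat([z, sex], -1)), mu, log_var

    model = ConditionalVAE()
    fc = torch.randn(4, D)                       # stand-in vectorized FC matrices
    sex = torch.randint(0, 2, (4, 1)).float()    # binary phenotype covariate
    recon, mu, log_var = model(fc, sex)
    # Atypicality score per scan: input-vs-reconstruction dissimilarity.
    score = ((recon - fc) ** 2).mean(-1)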
  3. We propose a novel algorithm for quantizing continuous latent representations in trained models. Our approach applies to deep probabilistic models, such as variational autoencoders (VAEs), and enables both data and model compression. Unlike current end-to-end neural compression methods that tailor the model to a fixed quantization scheme, our algorithm separates model design and training from quantization. Consequently, our algorithm enables “plug-and-play” compression at a variable rate-distortion trade-off, using a single trained model. Our algorithm can be seen as a novel extension of arithmetic coding to the continuous domain, and uses adaptive quantization accuracy based on estimates of posterior uncertainty. Our experimental results demonstrate the importance of taking posterior uncertainties into account, and show that image compression with the proposed algorithm outperforms JPEG over a wide range of bit rates using only a single standard VAE. Further experiments on Bayesian neural word embeddings demonstrate the versatility of the proposed method.
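    A minimal numpy sketch of the core idea of uncertainty-adaptive quantization: latent dimensions whose posterior standard deviation is large tolerate a coarser grid, so each posterior mean is rounded to a step proportional to its posterior std. The function name quantize_latents and the single knob kappa are assumptions; the paper's actual method, an extension of arithmetic coding to the continuous domain, is not reproduced here.

    import numpy as np

    def quantize_latents(mu, sigma, kappa=1.0):
        """Round each posterior mean mu_i to a grid of spacing kappa * sigma_i,
        so quantization error stays small relative to posterior uncertainty."""
        step = kappa * sigma
        symbols = np.round(mu / step).astype(int)   # integers to entropy-code
        return symbols, symbols * step              # symbols, dequantized latents

    mu = np.array([0.31, -2.40, 0.05])       # posterior means from a trained encoder
    sigma = np.array([0.02, 0.90, 0.30])     # posterior std devs (uncertainty)
    symbols, z_hat = quantize_latents(mu, sigma, kappa=0.5)
    # Larger kappa: coarser grid, lower rate, higher distortion; one trained
    # model serves every point on the trade-off ("plug-and-play").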