We propose an algorithm, guided variational autoencoder (Guided-VAE), that learns a controllable generative model by performing latent representation disentanglement learning. The learning objective is achieved by providing signals to the latent encoding/embedding in the VAE without changing its main backbone architecture, hence retaining the desirable properties of the VAE. We design an unsupervised strategy and a supervised strategy in Guided-VAE and observe enhanced modeling and controlling capability over the vanilla VAE. In the unsupervised strategy, we guide the VAE learning by introducing a lightweight decoder that learns latent geometric transformation and principal components; in the supervised strategy, we use an adversarial excitation and inhibition mechanism to encourage the disentanglement of the latent variables. Guided-VAE is transparent and simple, both for the general representation learning task and for disentanglement learning. In a number of representation learning experiments, we observe improved synthesis/sampling, better disentanglement for classification, and reduced classification errors in meta-learning.
Chroma-VAE: Mitigating Shortcut Learning with Generative Classifiers
Deep neural networks are susceptible to shortcut learning, using simple features to achieve low training loss without discovering essential semantic structure. Contrary to prior belief, we show that generative models alone are not sufficient to prevent shortcut learning, despite an incentive to recover a more comprehensive representation of the data than discriminative approaches. However, we observe that shortcuts are preferentially encoded with minimal information, a fact that generative models can exploit to mitigate shortcut learning. In particular, we propose Chroma-VAE, a two-pronged approach where a VAE classifier is initially trained to isolate the shortcut in a small latent subspace, allowing a secondary classifier to be trained on the complementary, shortcut-free latent subspace. In addition to demonstrating the efficacy of Chroma-VAE on benchmark and real-world shortcut learning tasks, our work highlights the potential for manipulating the latent space of generative classifiers to isolate or interpret specific correlations.
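The two-stage idea in the abstract above (isolate the shortcut in a small latent subspace, then classify on the complementary subspace) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the latent codes are hand-written toy values rather than the output of a trained VAE classifier, the helper names `split_latent`, `fit_centroids`, and `predict` are hypothetical, and the nearest-centroid rule is a stand-in for whatever secondary classifier the authors actually use.

```python
import numpy as np

def split_latent(z, shortcut_dim):
    """Split latent codes into (shortcut subspace, complementary subspace).

    Assumption: the first `shortcut_dim` coordinates are the small subspace
    that the first-stage VAE classifier used, so they hold the shortcut.
    """
    z = np.asarray(z, dtype=float)
    if z.ndim != 2 or not 0 < shortcut_dim < z.shape[1]:
        raise ValueError("shortcut_dim must be between 1 and latent_dim - 1")
    return z[:, :shortcut_dim], z[:, shortcut_dim:]

def fit_centroids(features, labels):
    """Stand-in secondary classifier: one mean vector per class."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(features, classes, centroids):
    """Assign each row to the class with the nearest centroid."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

# Toy latent codes (hypothetical). Coordinate 0 plays the role of the shortcut:
# it separates the training labels perfectly. Coordinates 1-2 carry the
# weaker "semantic" signal.
z_train = np.array([[ 5.0, -1.0, -1.2],
                    [ 5.1, -0.8, -0.9],
                    [-5.0,  1.0,  1.1],
                    [-4.9,  0.9,  1.3]])
y_train = np.array([0, 0, 1, 1])

# Train the secondary classifier only on the complementary subspace.
_, z2_train = split_latent(z_train, shortcut_dim=1)
classes, centroids = fit_centroids(z2_train, y_train)

# At test time the shortcut coordinate is flipped relative to training,
# but the secondary classifier never sees it.
z_test = np.array([[-5.0, -1.0, -1.0],
                   [ 5.0,  1.0,  1.0]])
_, z2_test = split_latent(z_test, shortcut_dim=1)
pred = predict(z2_test, classes, centroids)
print(pred)
```

Because the secondary classifier only receives the complementary coordinates, flipping the shortcut coordinate at test time cannot change its predictions in this toy setting. Whether a real VAE classifier confines the shortcut this cleanly is an empirical question that the sketch does not address.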
- Award ID(s): 1922658
- PAR ID: 10438108
- Date Published:
- Journal Name: NeurIPS
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In this work, we present a new approach for latent system dynamics and remaining useful life (RUL) estimation of complex degrading systems using generative modeling and reinforcement learning. The main contributions of the proposed method are two-fold. First, we show how a deep generative model can approximate the functionality of high-fidelity simulators and, thus, is able to substitute expensive and complex physics-based models with data-driven surrogate ones. In other words, we can use the generative model in lieu of the actual system as a surrogate model of the system. Furthermore, we show how to use such surrogate models for predictive analytics. Our method follows two main steps. First, we use a deep variational autoencoder (VAE) to learn the distribution over the latent state-space that characterizes the dynamics of the system under monitoring. After model training, the probabilistic VAE decoder becomes the surrogate system model. Then, we develop a scalable reinforcement learning framework using the decoder as the environment, to train an agent for identifying adequate approximate values of the latent dynamics, as well as the RUL. To our knowledge, the method presented in this paper is the first in industrial prognostics that utilizes generative models and reinforcement learning in that capacity. While the process requires extensive data preprocessing and environment-tailored design, which is not always possible, it demonstrates the ability of generative models working in conjunction with reinforcement learning to provide proper value estimations for system dynamics and their RUL. To validate the quality of the proposed method, we conducted numerical experiments using the train_FD002 dataset provided by the NASA CMAPSS data repository. Different subsets were used to train the VAE and the RL agent, and a leftover set was then used for model validation.
The results demonstrate the merit of our method and will further assist us in developing a data-driven RL environment that incorporates more complex latent dynamic layers, such as normal/faulty operating conditions and hazard processes.
-
A key advance in learning generative models is the use of amortized inference distributions that are jointly trained with the models. We find that existing training objectives for variational autoencoders can lead to inaccurate amortized inference distributions and, in some cases, improving the objective provably degrades the inference quality. In addition, it has been observed that variational autoencoders tend to ignore the latent variables when combined with a decoding distribution that is too flexible. We again identify the cause in existing training criteria and propose a new class of objectives (Info-VAE) that mitigate these problems. We show that our model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution. Through extensive qualitative and quantitative analyses, we demonstrate that our models outperform competing approaches on multiple performance metrics.
-
Distributed devices such as mobile phones can produce and store large amounts of data that can enhance machine learning models; however, this data may contain private information specific to the data owner that prevents the release of the data. We wish to reduce the correlation between user-specific private information and data while maintaining the useful information. Rather than learning a large model to achieve privatization from end to end, we introduce a decoupling of the creation of a latent representation and the privatization of data that allows user-specific privatization to occur in a distributed setting with limited computation and minimal disturbance on the utility of the data. We leverage a Variational Autoencoder (VAE) to create a compact latent representation of the data; however, the VAE remains fixed for all devices and all possible private labels. We then train a small generative filter to perturb the latent representation based on individual preferences regarding the private and utility information. The small filter is trained by utilizing a GAN-type robust optimization that can take place on a distributed device. We conduct experiments on three popular datasets: MNIST, UCI-Adult, and CelebA, and give a thorough evaluation including visualizing the geometry of the latent embeddings and estimating the empirical mutual information to show the effectiveness of our approach.