Generalized Loss-Sensitive Adversarial Learning with Manifold Margins
The classic Generative Adversarial Net (GAN) and its variants can be roughly categorized into two large families: unregularized and regularized GANs. By relaxing the nonparametric assumption on the discriminator in the classic GAN, the regularized GANs generalize better when producing new samples drawn from the real distribution. Although the regularized GANs have shown compelling performance, some problems remain unaddressed. It is well known that real data such as natural images are not uniformly distributed over the whole data space. Instead, they are often restricted to a low-dimensional manifold of the ambient space. This manifold assumption suggests that distance over the manifold should be a better measure of the distinction between real and fake samples. Thus, we define a pullback operator that maps samples back to their data manifold, and we define a manifold margin as the distance between the pullback representations, which is used to distinguish between real and fake samples and to learn the optimal generators. We justify the proposed model from both theoretical and empirical perspectives, demonstrating that it can produce high-quality images compared with other state-of-the-art GAN models.
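The manifold margin described in the abstract can be illustrated with a small toy sketch. The abstract gives no formulas, so everything concrete here is an assumption: `pullback` is a hypothetical stand-in for the learned pullback operator (it simply projects 2-D points onto the unit circle), and the hinge form in `loss_sensitive_penalty` follows the general shape used by loss-sensitive GANs rather than anything stated in this record.

```python
import math


def pullback(x):
    """Hypothetical stand-in for a learned pullback operator.

    Projects a 2-D point radially onto the unit circle, which plays the
    role of a toy one-dimensional data manifold.
    """
    norm = math.hypot(x[0], x[1])
    if norm == 0.0:
        return (1.0, 0.0)  # arbitrary choice: the direction is undefined at the origin
    return (x[0] / norm, x[1] / norm)


def manifold_margin(x_real, x_fake):
    """Distance between the pullback representations of two samples."""
    return math.dist(pullback(x_real), pullback(x_fake))


def loss_sensitive_penalty(loss_real, loss_fake, margin):
    """Assumed hinge penalty: max(0, margin + L(real) - L(fake)).

    The penalty is zero once the fake sample's loss exceeds the real
    sample's loss by at least the margin.
    """
    return max(0.0, margin + loss_real - loss_fake)


if __name__ == "__main__":
    real, fake = (2.0, 0.0), (0.0, 3.0)
    margin = manifold_margin(real, fake)
    print(margin, loss_sensitive_penalty(0.2, 1.0, margin))
```

In this toy setting, two points on the same ray from the origin have zero margin even when they are far apart in the ambient space, which is the intuition behind measuring distance on the manifold rather than in the ambient space.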
Award ID(s): 1704309
NSF-PAR ID: 10063198
Date Published:
Journal Name: Proceedings of European Conference on Computer Vision (ECCV 2018)
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this


Though generative adversarial networks (GANs) are prominent models for generating realistic and crisp images, they are unstable to train and suffer from the mode collapse problem. These problems come from approximating the intrinsically discontinuous distribution transform map with continuous DNNs. The recently proposed AE-OT model addresses the discontinuity problem by explicitly computing the discontinuous optimal transport map in the latent space of the autoencoder. Although AE-OT does not suffer from mode collapse, the images it generates are blurry. In this paper, we propose the AE-OT-GAN model to combine the advantages of both models: it generates high-quality images and at the same time overcomes the mode collapse problem. Specifically, we first embed the low-dimensional image manifold into the latent space with an autoencoder (AE). Then the extended semi-discrete optimal transport (SDOT) map is used to generate new latent codes. Finally, our GAN model is trained to generate high-quality images from the latent distribution induced by the extended SDOT map. The distribution transform map from this dataset-related latent distribution to the data distribution is continuous, and thus can be well approximated by continuous DNNs. Additionally, the paired data between the latent codes and the real images further constrains the generator and stabilizes the training process. Experiments on the simple MNIST dataset and on complex datasets such as CIFAR-10 and CelebA show the advantages of the proposed method.

This paper addresses mode collapse in generative adversarial networks (GANs). We view modes as a geometric structure of the data distribution in a metric space. Under this geometric lens, we embed subsamples of the dataset from an arbitrary metric space into the L2 space, while preserving their pairwise distance distribution. Not only does this metric embedding determine the dimensionality of the latent space automatically, it also enables us to construct a mixture of Gaussians from which to draw latent-space random vectors. We use the Gaussian mixture model in tandem with a simple augmentation of the objective function to train GANs. Every major step of our method is supported by theoretical analysis, and our experiments on real and synthetic data confirm that the generator is able to produce samples spreading over most of the modes while avoiding unwanted samples, outperforming several recent GAN variants on a number of metrics and offering new features.

Liu, Jin (Ed.)
Generative models rely on the idea that data can be represented in terms of latent variables which are uncorrelated by definition. Lack of correlation among the latent variable support is important because it suggests that the latent-space manifold is simpler to understand and manipulate than the real-space representation. Many types of generative model are used in deep learning, e.g., variational autoencoders (VAEs) and generative adversarial networks (GANs). Based on the idea that the latent space behaves like a vector space (Radford et al., 2015), we ask whether we can expand the latent-space representation of our data elements in terms of an orthonormal basis set. Here we propose a method to build a set of linearly independent vectors in the latent space of a trained GAN, which we call quasi-eigenvectors. These quasi-eigenvectors have two key properties: (i) they span the latent space, and (ii) a set of these quasi-eigenvectors maps one-to-one to each of the labeled features. We show that in the case of the MNIST image data set, while the number of dimensions in latent space is large by design, 98% of the data in real space map to a subdomain of latent space of dimensionality equal to the number of labels. We then show how the quasi-eigenvectors can be used for Latent Spectral Decomposition (LSD). We apply LSD to denoise MNIST images. Finally, using the quasi-eigenvectors, we construct rotation matrices in latent space which map to feature transformations in real space. Overall, from quasi-eigenvectors we gain insight regarding the latent space topology.
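The idea of expanding a latent vector in a basis and keeping only some components can be sketched generically. This is not the paper's procedure for constructing quasi-eigenvectors, which the abstract does not describe; it is a minimal illustration that assumes a set of linearly independent vectors is already available, orthonormalizes them with Gram-Schmidt, and rebuilds a vector from a chosen subset of its coefficients. The function names `gram_schmidt` and `project` are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for b in basis:
            c = dot(w, b)  # component of the running residual along b
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            raise ValueError("vectors are not linearly independent")
        basis.append([wi / norm for wi in w])
    return basis


def project(z, basis, keep):
    """Expand z in the orthonormal basis and rebuild it from the kept indices."""
    coeffs = [dot(z, b) for b in basis]
    out = [0.0] * len(z)
    for i in keep:
        out = [o + coeffs[i] * bi for o, bi in zip(out, basis[i])]
    return out


if __name__ == "__main__":
    basis = gram_schmidt([[1, 1, 0], [1, 0, 0], [0, 0, 2]])
    # Keep only the first two components, discarding the third direction.
    print(project([3.0, 4.0, 5.0], basis, keep=[0, 1]))
```

Dropping components outside a chosen subspace is one simple way to read "denoising" in a spectral decomposition; whether the paper's LSD works exactly this way is not stated in the abstract.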