skip to main content


Title: Bootstrapping Conditional GANs for Video Game Level Generation
Generative Adversarial Networks (GANs) have shown impressive results for image generation. However, GANs face challenges in generating contents with certain types of constraints, such as game levels. Specifically, it is difficult to generate levels that have aesthetic appeal and are playable at the same time. Additionally, because training data usually is limited, it is challenging to generate unique levels with current GANs. In this paper, we propose a new GAN architecture named Conditional Embedding Self-Attention Generative Adversarial Net- work (CESAGAN) and a new bootstrapping training procedure. The CESAGAN is a modification of the self-attention GAN that incorporates an embedding feature vector input to condition the training of the discriminator and generator. This allows the network to model non-local dependency between game objects, and to count objects. Additionally, to reduce the number of levels necessary to train the GAN, we propose a bootstrapping mechanism in which playable generated levels are added to the training set. The results demonstrate that the new approach does not only generate a larger number of levels that are playable but also generates fewer duplicate levels compared to a standard GAN.  more » « less
Award ID(s):
1717324
NSF-PAR ID:
10231876
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
IEEE Conference on Games
Page Range / eLocation ID:
41 to 48
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Security monitoring is crucial for maintaining a strong IT infrastructure by protecting against emerging threats, identifying vulnerabilities, and detecting potential points of failure. It involves deploying advanced tools to continuously monitor networks, systems, and configurations. However, organizations face challenges in adapting modern techniques like Machine Learning (ML) due to privacy and security risks associated with sharing internal data. Compliance with regulations like GDPR further complicates data sharing. To promote external knowledge sharing, a secure and privacy-preserving method for organizations to share data is necessary. Privacy-preserving data generation involves creating new data that maintains privacy while preserving key characteristics and properties of the original data so that it is still useful in creating downstream models of attacks. Generative models, such as Generative Adversarial Networks (GAN), have been proposed as a solution for privacy preserving synthetic data generation. However, standard GANs are limited in their capabilities to generate realistic system data. System data have inherent constraints, e.g., the list of legitimate I.P. addresses and port numbers are limited, and protocols dictate a valid sequence of network events. Standard generative models do not account for such constraints and do not utilize domain knowledge in their generation process. Additionally, they are limited by the attribute values present in the training data. This poses a major privacy risk, as sensitive discrete attribute values are repeated by GANs. To address these limitations, we propose a novel model for Knowledge Infused Privacy Preserving Data Generation. A privacy preserving Generative Adversarial Network (GAN) is trained on system data for generating synthetic datasets that can replace original data for downstream tasks while protecting sensitive data. Knowledge from domain-specific knowledge graphs is used to guide the data generation process, check for the validity of generated values, and enrich the dataset by diversifying the values of attributes. We specifically demonstrate this model by synthesizing network data captured by the network capture tool, Wireshark. We establish that the synthetic dataset holds up to the constraints of the network-specific datasets and can replace the original dataset in downstream tasks. 
    more » « less
  2. Though generative adversarial networks (GANs) are prominent models to generate realistic and crisp images, they are unstable to train and suffer from the mode collapse problem. The problems of GANs come from approximating the intrinsic discontinuous distribution transform map with continuous DNNs. The recently proposed AE-OT model addresses the discontinuity problem by explicitly computing the discontinuous optimal transform map in the latent space of the autoencoder. Though have no mode collapse, the generated images by AE-OT are blurry. In this paper, we propose the AE-OT-GAN model to utilize the advantages of the both models: generate high quality images and at the same time overcome the mode collapse problems. Specifically, we firstly embed the low dimensional image manifold into the latent space by autoencoder (AE). Then the extended semi-discrete optimal transport (SDOT) map is used to generate new latent codes. Finally, our GAN model is trained to generate high quality images from the latent distribution induced by the extended SDOT map. The distribution transform map from this dataset related latent distribution to the data distribution will be continuous, and thus can be well approximated by the continuous DNNs. Additionally, the paired data between the latent codes and the real images gives us further restriction about the generator and stabilizes the training process. Experiments on simple MNIST dataset and complex datasets like CIFAR10 and CelebA show the advantages of the proposed method. 
    more » « less
  3. Generative Adversarial Networks (GANs) have recently attracted considerable attention in the AI community due to their ability to generate high-quality data of significant statisti- cal resemblance to real data. Fundamentally, GAN is a game between two neural networks trained in an adversarial manner to reach a zero-sum Nash equilibrium profile. Despite the improvement accomplished in GANs in the last few years, several issues remain to be solved. This paper reviews the literature on the game-theoretic aspects of GANs and addresses how game theory models can address specific challenges of generative models and improve the GAN’s performance. We first present some preliminaries, including the basic GAN model and some game theory background. We then present a taxonomy to clas- sify state-of-the-art solutions into three main categories: modified game models, modified architectures, and modified learning methods. The classification is based on modifications made to the basic GAN model by proposed game-theoretic approaches in the literature. We then explore the objectives of each category and discuss recent works in each class. Finally, we discuss the remaining challenges in this field and present future research directions. 
    more » « less
  4. A Generative Adversarial Network (GAN) is an unsupervised generative framework to generate a sample distribution that is identical to the data distribution. Recently, mix strategy multi-generator/discriminator GANs have been shown to outperform single pair GANs. However, the mixed model suffers from the problem of linearly growing training time. Also, imbalanced training among generators makes it difficult to parallelize. In this paper, we propose a balanced mix-generator GAN that works in parallel by mixing multiple disjoint generators to approximate the real distribution. The weights of the discriminator and the classifier are controlled by a balance strategy. We also present an efficient loss function, to force each generator to embrace few modes with a high probability. Our model is naturally adaptive to large parallel computation frameworks. Each generator can be trained on multiple GPUs asynchronously. We have performed extensive experiments on synthetic datasets, MNIST1000, CIFAR-10, and ImageNet. The results establish that our model can achieve the state-of-the-art performance (in terms of the modes coverage and the inception score), with significantly reduced training time. We also show that the missing mode problem can be relieved with a growing number of generators. 
    more » « less
  5. Abstract Objective

    Electronic medical records (EMRs) can support medical research and discovery, but privacy risks limit the sharing of such data on a wide scale. Various approaches have been developed to mitigate risk, including record simulation via generative adversarial networks (GANs). While showing promise in certain application domains, GANs lack a principled approach for EMR data that induces subpar simulation. In this article, we improve EMR simulation through a novel pipeline that (1) enhances the learning model, (2) incorporates evaluation criteria for data utility that informs learning, and (3) refines the training process.

    Materials and Methods

    We propose a new electronic health record generator using a GAN with a Wasserstein divergence and layer normalization techniques. We designed 2 utility measures to characterize similarity in the structural properties of real and simulated EMRs in the original and latent space, respectively. We applied a filtering strategy to enhance GAN training for low-prevalence clinical concepts. We evaluated the new and existing GANs with utility and privacy measures (membership and disclosure attacks) using billing codes from over 1 million EMRs at Vanderbilt University Medical Center.

    Results

    The proposed model outperformed the state-of-the-art approaches with significant improvement in retaining the nature of real records, including prediction performance and structural properties, without sacrificing privacy. Additionally, the filtering strategy achieved higher utility when the EMR training dataset was small.

    Conclusions

    These findings illustrate that EMR simulation through GANs can be substantially improved through more appropriate training, modeling, and evaluation criteria.

     
    more » « less