TiGAN: Text-Based Interactive Image Generation and Manipulation

Zhou, Yufan; Zhang, Ruiyi; Gu, Jiuxiang; Tensmeyer, Chris; Yu, Tong; Chen, Changyou; Xu, Jinhui; Sun, Tong

doi:10.1609/aaai.v36i3.20270

Using natural-language feedback to guide image generation and manipulation can greatly lower the effort and skill required. This topic has received increased attention in recent years through the refinement of Generative Adversarial Networks (GANs); however, most existing works are limited to single-round interaction, which does not reflect real-world interactive image editing workflows. Furthermore, previous works dealing with multi-round scenarios are limited to predefined feedback sequences, which is also impractical. In this paper, we propose a novel framework for Text-based Interactive image generation and manipulation (TiGAN) that responds to users' natural-language feedback. TiGAN uses the pre-trained CLIP model to understand users' natural-language feedback and exploits contrastive learning for a better text-to-image mapping. To maintain image consistency during interactions, TiGAN generates intermediate feature vectors aligned with the feedback and selectively feeds these vectors to our proposed generative model. Empirical results on several datasets show that TiGAN improves both interaction efficiency and image quality while better avoiding undesirable image manipulation during interactions.
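
The abstract mentions contrastive learning for text-to-image mapping but gives no formulation. As a rough illustration only, the sketch below shows a generic symmetric contrastive (InfoNCE-style) loss over a batch of paired image and text embeddings, written with NumPy. It is not the authors' implementation: the function names, the temperature value of 0.07, and the use of plain arrays in place of CLIP features are all assumptions made for this example.

```python
import numpy as np


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Generic symmetric contrastive loss (hypothetical helper, not from the paper).

    image_emb, text_emb: arrays of shape (batch, dim); row i of each is a matched pair.
    temperature: assumed scaling constant, chosen here only for illustration.
    """
    # L2-normalize so the dot product below is a cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix: entry (i, j) compares image i with text j.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])

    # Matched pairs sit on the diagonal; score them against all alternatives
    # in both directions (image-to-text and text-to-image).
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)
```

With correctly paired embeddings the loss is close to zero, and it grows when the pairing is broken, which is the property a text-to-image mapping trained this way would rely on.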

Award ID(s): 1910492

PAR ID: 10351123

Author(s) / Creator(s): Zhou, Yufan; Zhang, Ruiyi; Gu, Jiuxiang; Tensmeyer, Chris; Yu, Tong; Chen, Changyou; Xu, Jinhui; Sun, Tong

Date Published: 2022-06-30

Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence

Volume: 36

Issue: 3

ISSN: 2159-5399

Page Range / eLocation ID: 3580 to 3588

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1609/aaai.v36i3.20270
