Title: Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains
We consider the novel task of learning disentangled representations of object shape and appearance across multiple domains (e.g., dogs and cars). The goal is to learn a generative model of an intermediate distribution that borrows a subset of properties from each domain, enabling the generation of images that exist in neither domain alone. This challenging problem requires accurately disentangling object shape, appearance, and background within each domain, so that the appearance and shape factors of the two domains can be interchanged. We build on an existing approach that can disentangle these factors within a single domain but struggles to do so across domains. Our key technical contribution is to represent object appearance with a differentiable histogram of visual features, and to optimize the generator so that two images with the same latent appearance factor but different latent shape factors produce similar histograms. On multiple multi-domain datasets, we demonstrate that our method yields accurate and consistent appearance and shape transfer across domains.
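The abstract's exact histogram formulation is not given here, but the idea of a differentiable histogram can be illustrated with a common construction: Gaussian soft binning, where each feature value contributes a smooth weight to every bin instead of a hard count. The bin count, bandwidth, and the L1 distance between histograms below are illustrative assumptions, and plain NumPy stands in for the autodiff framework a real training setup would use:

```python
import numpy as np

def soft_histogram(feats, bins=16, sigma=0.1):
    """Soft-binned (differentiable) histogram of per-pixel features.

    feats: array of shape (N, C) with feature values in [0, 1].
    Each value is soft-assigned to all bins via a Gaussian kernel,
    so the histogram is a smooth function of the features.
    """
    centers = np.linspace(0.0, 1.0, bins)          # (bins,) bin centers
    d = feats[..., None] - centers                 # (N, C, bins) distances
    w = np.exp(-0.5 * (d / sigma) ** 2)            # Gaussian soft assignment
    hist = w.sum(axis=0)                           # (C, bins) soft counts
    return hist / hist.sum(axis=-1, keepdims=True) # normalize per channel

def appearance_consistency_loss(feats_a, feats_b):
    """Penalize histogram mismatch between two images that share the
    latent appearance factor but differ in the latent shape factor."""
    return np.abs(soft_histogram(feats_a) - soft_histogram(feats_b)).mean()
```

Because the Gaussian weights vary smoothly with the feature values, gradients can flow from the histogram distance back into the generator, which is what makes this usable as a training objective.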
Award ID(s):
2150012
NSF-PAR ID:
10320566
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
International Conference on Learning Representations (ICLR)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose FineGAN, a novel unsupervised GAN framework that disentangles the background, object shape, and object appearance to hierarchically generate images of fine-grained object categories. To disentangle these factors without supervision, our key idea is to use information theory to associate each factor with a latent code, and to condition the relationships between the codes in a specific way to induce the desired hierarchy. Through extensive experiments, we show that FineGAN achieves the desired disentanglement and generates realistic and diverse images belonging to fine-grained classes of birds, dogs, and cars. Using FineGAN's automatically learned features, we also cluster real images as a first attempt at solving the novel problem of unsupervised fine-grained object category discovery. Our code/models/demo can be found at https://github.com/kkanshul/finegan
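The "conditioned relationships between codes" mentioned above can be made concrete with a small sketch. This is not FineGAN's actual implementation; the group sizes and the specific rule tying each child (appearance) code to one parent (shape) code are illustrative assumptions about how such a hierarchy can be induced at sampling time:

```python
import random

NUM_PARENTS = 20          # hypothetical number of parent (shape) codes
CHILDREN_PER_PARENT = 10  # hypothetical appearance variants per shape

def sample_codes(rng=random):
    """Sample (background, parent, child) latent codes such that each
    child (appearance) code belongs to exactly one parent (shape) code,
    inducing a fixed shape -> appearance hierarchy."""
    p = rng.randrange(NUM_PARENTS)                    # parent: object shape
    c = p * CHILDREN_PER_PARENT + rng.randrange(CHILDREN_PER_PARENT)
    b = rng.randrange(NUM_PARENTS)                    # background code
    return b, p, c
```

Constraining which child codes may co-occur with which parent codes is what lets the generator discover that "shape" and "appearance" are separate factors without any labels.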
  2. The sky exhibits a unique spatial polarization pattern produced by the scattering of unpolarized sunlight. Just as insects use this angular pattern to navigate, we use it to map pixels to directions on the sky. That is, we show that the polarization pattern encoded in the polarimetric appearance of an object captured under the sky can be decoded to reveal the surface normal at each pixel. We derive a polarimetric reflection model of a diffuse-plus-mirror surface lit by the sun and a clear sky. This model is used to recover the per-pixel surface normal of an object from a single polarimetric image, or from multiple polarimetric images captured under the sky at different times of the day. We experimentally evaluate the accuracy of our shape-from-sky method on a number of real objects with different surface compositions. The results clearly show that this passive approach to fine-geometry recovery, which fully leverages the unique illumination provided by nature, is a viable option for 3D sensing. With the advent of quad-Bayer polarization chips, we believe the implications of our method span a wide range of domains.
  3. Chiral magnets have recently emerged as hosts for topological spin textures and related transport phenomena, which can find use in next-generation spintronic devices. The coupling between structural chirality and noncollinear magnetism is crucial for the stabilization of complex spin structures such as magnetic skyrmions. Most studies have focused on the physical properties of homochiral states, which are favored by crystal growth and by the absence of long-ranged interactions between domains of opposite chirality. The effects of a high density of chiral domains and domain boundaries on magnetic states have therefore rarely been explored. Herein, we report layered heterochiral Cr1/3TaS2, which exhibits numerous chiral domains forming topological defects and a nanometer-scale helimagnetic order interlocked with the structural chirality. Tuning the chiral domain density, we discovered a macroscopic topological magnetic texture inside each chiral domain that has the appearance of a spiral magnetic superstructure composed of quasiperiodic Néel domain walls. The spirality of this object can have either sign and is decoupled from the structural chirality. In weak in-plane magnetic fields, it transforms into a nonspiral array of concentric ring domains. Numerical simulations suggest that this magnetic superstructure is stabilized by strains in the heterochiral state that favor noncollinear spins. Our results unveil topological structure/spin couplings across a wide range of length scales and highly tunable spin textures in heterochiral magnets.
  4. We provide an approach to reconstruct spatiotemporal 3D models of aging objects, such as fruit, containing time-varying shape and appearance, using multi-view time-lapse videos captured by a microenvironment of Raspberry Pi cameras. Our approach represents the 3D structure of the object prior to aging with a static 3D mesh reconstructed from multiple photographs of the object captured using a rotating camera track. We manually align the 3D mesh to the images at the first time instant. Our approach then automatically deforms the aligned 3D mesh to match the object across the multi-viewpoint time-lapse videos. We texture map the deformed 3D meshes with intensities from the frames at each time instant to create the spatiotemporal 3D model of the object. Our results reveal the time dependence of volume loss due to transpiration and of color transformation due to enzymatic browning on banana peels and in exposed parts of bitten fruit.