

Title: Learning Disentangled Semantic Representation for Domain Adaptation

Domain adaptation is an important but challenging task. Most existing domain adaptation methods struggle to extract a domain-invariant representation from a feature space in which domain information and semantic information are entangled. Unlike previous efforts that work on the entangled feature space, we aim to extract the domain-invariant semantic information in the latent disentangled semantic representation (DSR) of the data. In DSR, we assume the data generation process is controlled by two independent sets of variables: the semantic latent variables and the domain latent variables. Under this assumption, we employ a variational auto-encoder to reconstruct the semantic latent variables and domain latent variables behind the data. We further devise a dual adversarial network to disentangle these two sets of reconstructed latent variables. The disentangled semantic latent variables are finally adapted across the domains. Experimental studies show that our model yields state-of-the-art performance on several domain adaptation benchmark datasets.
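
The two-group latent assumption in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear encoder and decoder in NumPy, a standard normal prior, and hypothetical names (`init_params`, `forward`, `sem_dim`, `dom_dim`). It only shows a variational auto-encoder forward pass whose latent vector is split into a semantic part and a domain part; the dual adversarial network that would push the two parts apart is omitted.

```python
import numpy as np


def init_params(rng, x_dim, sem_dim, dom_dim):
    """Random weights for a toy linear encoder/decoder (hypothetical model)."""
    z_dim = sem_dim + dom_dim
    return {
        "W_mu": rng.normal(0.0, 0.1, (x_dim, z_dim)),      # encoder mean
        "W_logvar": rng.normal(0.0, 0.1, (x_dim, z_dim)),  # encoder log-variance
        "W_dec": rng.normal(0.0, 0.1, (z_dim, x_dim)),     # decoder
    }


def forward(params, x, sem_dim, rng):
    """Encode x, sample latents, split them into two groups, and decode."""
    mu = x @ params["W_mu"]
    logvar = x @ params["W_logvar"]

    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # Assumption: the first sem_dim coordinates are treated as semantic
    # latents, the remaining ones as domain latents.
    z_sem, z_dom = z[:, :sem_dim], z[:, sem_dim:]

    # The decoder sees both groups, since both are assumed to generate the data.
    x_hat = np.concatenate([z_sem, z_dom], axis=1) @ params["W_dec"]

    # Negative ELBO: squared-error reconstruction plus KL(q(z|x) || N(0, I)).
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    kl = np.mean(-0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))
    return z_sem, z_dom, recon + kl


rng = np.random.default_rng(0)
params = init_params(rng, x_dim=8, sem_dim=3, dom_dim=2)
x = rng.standard_normal((4, 8))
z_sem, z_dom, loss = forward(params, x, sem_dim=3, rng=rng)
print(z_sem.shape, z_dom.shape)
```

In a full model along the lines the abstract describes, only `z_sem` would be passed to the label classifier and aligned across domains, while adversarial objectives would discourage `z_sem` from carrying domain information and `z_dom` from carrying semantic information.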

 
Award ID(s):
1829681
NSF-PAR ID:
10125748
Journal Name:
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Page Range / eLocation ID:
2060 to 2066
Sponsoring Org:
National Science Foundation
More Like this
  1. Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models have shown tremendous progress towards adapting to new environments by focusing either on discovering domain invariant representations or by mapping between unpaired image domains. While feature space methods are difficult to interpret and sometimes fail to capture pixel-level and low-level domain shifts, image space methods sometimes fail to incorporate high level semantic knowledge relevant for the end task. We propose a model which adapts between domains using both generative image space alignment and latent representation space alignment. Our approach, Cycle-Consistent Adversarial Domain Adaptation (CyCADA), guides transfer between domains according to a specific discriminatively trained task and avoids divergence by enforcing consistency of the relevant semantics before and after adaptation. We evaluate our method on a variety of visual recognition and prediction settings, including digit classification and semantic segmentation of road scenes, advancing state-of-the-art performance for unsupervised adaptation from synthetic to real world driving domains. 
  2. Given a population of longitudinal neuroimaging measurements defined on a brain network, exploiting temporal dependencies within the sequence of data and the corresponding latent variables defined on the graph (i.e., the network encoding relationships between regions of interest (ROIs)) can greatly benefit characterization of the brain. Here, it is important to distinguish time-variant (e.g., longitudinal measures) and time-invariant (e.g., gender) components so that they can be analyzed individually. For this, we propose a Disentangled Sequential Graph Autoencoder, which combines the Sequential Variational Autoencoder (SVAE), graph convolution, and a semi-supervised framework to learn a latent space composed of time-variant and time-invariant latent variables that characterizes a disentangled representation of the measurements over all ROIs. Incorporating target information in the decoder with a supervised loss lets us achieve more effective representation learning and improved classification. We validate our proposed method on longitudinal cortical thickness data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study. Our method outperforms baselines that use traditional techniques, demonstrating benefits for longitudinal data representation in both label prediction and longitudinal data generation. 
  3. Brain-Machine Interfaces (BMIs) have recently emerged as a clinically viable option to restore voluntary movements after paralysis. These devices are based on the ability to extract information about movement intent from neural signals recorded using multi-electrode arrays chronically implanted in the motor cortices of the brain. However, the inherent loss and turnover of recorded neurons requires repeated recalibrations of the interface, which can potentially alter the day-to-day user experience. The resulting need for continued user adaptation interferes with the natural, subconscious use of the BMI. Here, we introduce a new computational approach that decodes movement intent from a low-dimensional latent representation of the neural data. We implement various domain adaptation methods to stabilize the interface over significantly long times. This includes Canonical Correlation Analysis used to align the latent variables across days; this method requires prior point-to-point correspondence of the time series across domains. Alternatively, we match the empirical probability distributions of the latent variables across days through the minimization of their Kullback-Leibler divergence. These two methods provide a significant and comparable improvement in the performance of the interface. However, implementation of an Adversarial Domain Adaptation Network trained to match the empirical probability distribution of the residuals of the reconstructed neural signals outperforms the two methods based on latent variables, while requiring remarkably few data points to solve the domain adaptation problem. 
  4. Domain adaptation techniques using deep neural networks have mainly been used to solve the distribution shift problem in homogeneous domains, where data usually share similar feature spaces and have the same dimensionality. Nevertheless, real-world applications often deal with heterogeneous domains that come from completely different feature spaces with different dimensionalities. In our remote sensing application, two remote sensing datasets, one collected by an active sensor and one by a passive sensor, are heterogeneous. In particular, CALIOP actively measures each atmospheric column. In this study, 25 measured variables/features that are sensitive to cloud phase are used, and they are fully labeled. VIIRS is an imaging radiometer, which collects radiometric measurements of the surface and atmosphere in the visible and infrared bands. Recent studies have shown that passive sensors may have difficulty predicting cloud/aerosol types in complicated atmospheres (e.g., overlapping cloud and aerosol layers, cloud over snow/ice surfaces, etc.). To overcome the challenge of cloud property retrieval from passive sensors, we develop a novel VAE-based approach (VDAM) that learns a domain-invariant representation capturing the spatial pattern from multiple satellite remote sensing datasets, and use it to build a domain-invariant cloud property retrieval method that accurately classifies different cloud types (labels) in the passive sensing dataset. We further exploit a weight-based alignment method on the label space to learn a domain adaptation technique suited to the remote sensing application. Experiments demonstrate that our method outperforms other state-of-the-art machine learning methods and achieves higher accuracy in cloud property retrieval on the passive satellite dataset. 
  5. Morozov, Alexandre V. (Ed.)

    Recent advances in molecular transduction of odorants in the Olfactory Sensory Neurons (OSNs) of the Drosophila Antenna have shown that the odorant object identity is multiplicatively coupled with the odorant concentration waveform. The resulting combinatorial neural code is a confounding representation of odorant semantic information (identity) and syntactic information (concentration). To distill the functional logic of odor information processing in the Antennal Lobe (AL), a number of challenges need to be addressed, including 1) how is the odorant semantic information decoupled from the syntactic information at the level of the AL, 2) how are these two information streams processed by the diverse AL Local Neurons (LNs), and 3) what is the end-to-end functional logic of the AL?

    By analyzing single-channel physiology recordings at the output of the AL, we found that the Projection Neuron responses can be decomposed into a concentration-invariant component, and two transient components boosting the positive/negative concentration contrast that indicate onset/offset timing information of the odorant object. We hypothesized that the concentration-invariant component, in the multi-channel context, is the recovered odorant identity vector presented between onset/offset timing events.

    We developed a model of LN pathways in the Antennal Lobe termed the differential Divisive Normalization Processors (DNPs), which robustly extract the semantics (the identity of the odorant object) and the ON/OFF semantic timing events indicating the presence/absence of an odorant object. For real-time processing with spiking PN models, we showed that the phase-space of the biological spike generator of the PN offers an intuitive perspective for the representation of recovered odorant semantics and examined the dynamics induced by the odorant semantic timing events. Finally, we provided theoretical and computational evidence for the functional logic of the AL as a robust ON-OFF odorant object identity recovery processor across odorant identities, concentration amplitudes and waveform profiles.
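
The idea that an identity vector multiplicatively coupled with concentration can be recovered by divisive normalization can be illustrated with a static toy example. This sketch is an assumption-laden simplification, not the differential DNP model described above: it uses hypothetical names (`divisive_normalize`, `identity`, `concentration`), ignores temporal dynamics and spiking, and simply divides each channel's response by the summed response across channels.

```python
import numpy as np


def divisive_normalize(responses, eps=1e-12):
    """Divide each channel by the total response across channels (per row)."""
    total = responses.sum(axis=1, keepdims=True)
    return responses / (total + eps)


# Hypothetical odorant identity vector over three channels (sums to 1).
identity = np.array([0.2, 0.5, 0.3])

# Three different concentration amplitudes for the same odorant.
concentration = np.array([1.0, 5.0, 20.0])

# Multiplicative coupling: each row is concentration * identity.
responses = concentration[:, None] * identity[None, :]

# After normalization, every row carries the same identity pattern,
# regardless of the concentration that scaled it.
recovered = divisive_normalize(responses)
print(np.allclose(recovered, identity))
```

Because the concentration factor appears in both the numerator and the denominator, it cancels, which is the sense in which the normalized response is concentration-invariant in this toy setting.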

     