This content will become publicly available on May 17, 2025

Title: Learning Gaze-aware Compositional GAN from Limited Annotations
Gaze-annotated facial data is crucial for training deep neural networks (DNNs) for gaze estimation. However, obtaining these data is labor-intensive and requires specialized equipment due to the challenge of accurately annotating the gaze direction of a subject. In this work, we present a generative framework to create annotated gaze data by leveraging the benefits of labeled and unlabeled data sources. We propose a Gaze-aware Compositional GAN that learns to generate annotated facial images from a limited labeled dataset. Then we transfer this model to an unlabeled data domain to take advantage of the diversity it provides. Experiments demonstrate our approach's effectiveness in generating within-domain image augmentations in the ETH-XGaze dataset and cross-domain augmentations in the CelebAMask-HQ domain for gaze estimation DNN training. We also show additional applications of our work, which include facial image editing and gaze redirection.
Award ID(s):
2239688
PAR ID:
10575119
Publisher / Repository:
ACM New York, NY, USA
Journal Name:
Proceedings of the ACM on Computer Graphics and Interactive Techniques
Volume:
7
Issue:
2
ISSN:
2577-6193
Page Range / eLocation ID:
1 to 17
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. 3D instance segmentation for unlabeled imaging modalities is a challenging but essential task as collecting expert annotation can be expensive and time-consuming. Existing works segment a new modality by either deploying pre-trained models optimized on diverse training data or sequentially conducting image translation and segmentation with two relatively independent networks. In this work, we propose a novel Cyclic Segmentation Generative Adversarial Network (CySGAN) that conducts image translation and instance segmentation simultaneously using a unified network with weight sharing. Since the image translation layer can be removed at inference time, our proposed model does not introduce additional computational cost upon a standard segmentation model. For optimizing CySGAN, besides the CycleGAN losses for image translation and supervised losses for the annotated source domain, we also utilize self-supervised and segmentation-based adversarial objectives to enhance the model performance by leveraging unlabeled target domain images. We benchmark our approach on the task of 3D neuronal nuclei segmentation with annotated electron microscopy (EM) images and unlabeled expansion microscopy (ExM) data. The proposed CySGAN outperforms pre-trained generalist models, feature-level domain adaptation models, and the baselines that conduct image translation and segmentation sequentially. Our implementation and the newly collected, densely annotated ExM zebrafish brain nuclei dataset, named NucExM, are publicly available at https://connectomics-bazaar.github.io/proj/CySGAN/index.html. 
  2. Image segmentation is an essential step in biomedical image analysis. In recent years, deep learning models have achieved significant success in segmentation. However, deep learning requires large annotated datasets to train these models, which can be challenging to obtain in the biomedical imaging domain. In this paper, we aim to accomplish biomedical image segmentation with limited labeled data using active learning. We present a deep active learning framework that selects additional data points to be annotated by combining U-Net with an efficient and effective query strategy to capture the most uncertain and representative points. This algorithm decouples the representative part by first finding the core points in the unlabeled pool and then selecting the most uncertain points from the reduced pool, which are different from the labeled pool. In our experiment, only 13% of the dataset was required with active learning to outperform the model trained on the entire 2018 MICCAI Brain Tumor Segmentation (BraTS) dataset. Thus, active learning reduced the amount of labeled data required for image segmentation without a significant loss in accuracy.
  3. Elofsson, Arne (Ed.)
    Abstract. Motivation: Cryoelectron tomography (cryo-ET) visualizes the structure and spatial organization of macromolecules and their interactions with other subcellular components inside single cells in the close-to-native state at submolecular resolution. Such information is critical for the accurate understanding of cellular processes. However, subtomogram classification remains one of the major challenges for the systematic recognition and recovery of the macromolecule structures in cryo-ET because of imaging limits and data quantity. Recently, deep learning has significantly improved the throughput and accuracy of large-scale subtomogram classification. However, it is often difficult to get enough high-quality annotated subtomogram data for supervised training due to the enormous expense of labeling. To tackle this problem, it is beneficial to utilize another already annotated dataset to assist the training process. However, due to the discrepancy in image intensity distribution between the source domain and the target domain, a model trained on subtomograms in the source domain may perform poorly in predicting subtomogram classes in the target domain. Results: In this article, we adapt a few-shot domain adaptation method for deep learning-based cross-domain subtomogram classification. The essential idea of our method consists of two parts: (i) take full advantage of the distribution of plentiful unlabeled target domain data, and (ii) exploit the correlation between the whole source domain dataset and the few labeled target domain data. Experiments conducted on simulated and real datasets show that our method achieves significant improvement on cross-domain subtomogram classification compared with baseline methods. Availability and implementation: Software is available online at https://github.com/xulabs/aitom. Supplementary information: Supplementary data are available at Bioinformatics online.
  4. The size of image volumes in connectomics studies now reaches terabyte and often petabyte scales with a great diversity of appearance due to different sample preparation procedures. However, manual annotation of neuronal structures (e.g., synapses) in these huge image volumes is time-consuming, leading to limited labeled training data often smaller than 0.001% of the large-scale image volumes in application. Methods that can utilize in-domain labeled data and generalize to out-of-domain unlabeled data are urgently needed. Although many domain adaptation approaches have been proposed to address such issues in the natural image domain, few of them have been evaluated on connectomics data due to a lack of domain adaptation benchmarks. Therefore, to enable the development of domain adaptive synapse detection methods for large-scale connectomics applications, we annotated 14 image volumes from a biologically diverse set of Megaphragma viggianii brain regions originating from three different whole-brain datasets and organized the WASPSYN challenge at ISBI 2023. The annotations include coordinates of pre-synapses and post-synapses in the 3D space, together with their one-to-many connectivity information. This paper describes the dataset, the tasks, the proposed baseline, the evaluation method, and the results of the challenge. Limitations of the challenge and the impact on neuroscience research are also discussed. The challenge is and will continue to be available at https://codalab.lisn.upsaclay.fr/competitions/9169. Successful algorithms that emerge from our challenge may potentially revolutionize real-world connectomics research and further the cause that aims to unravel the complexity of brain structure and function.
  5. We propose a semi-supervised learning approach for video classification, VideoSSL, using convolutional neural networks (CNN). Like other computer vision tasks, existing supervised video classification methods demand a large amount of labeled data to attain good performance. However, annotation of a large dataset is expensive and time-consuming. To minimize the dependence on a large annotated dataset, our proposed semi-supervised method trains from a small number of labeled examples and exploits two regulatory signals from unlabeled data. The first signal is the pseudo-labels of unlabeled examples computed from the confidences of the CNN being trained. The other is the normalized probabilities, as predicted by an image classifier CNN, that capture information about the appearance of the objects of interest in the video. We show that, under the supervision of these guiding signals from unlabeled examples, a video classification CNN can achieve impressive performance using a small fraction of annotated examples on three publicly available datasets: UCF101, HMDB51, and Kinetics.
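The pseudo-labeling signal mentioned in the last abstract can be illustrated with a minimal sketch. This is not the VideoSSL implementation: the function name `pseudo_labels`, the 0.9 confidence threshold, and the toy probabilities are all assumptions made for illustration. It shows only the general idea of keeping the predicted class for unlabeled examples on which the classifier is confident.

```python
def pseudo_labels(probabilities, threshold=0.9):
    """Return (example_index, predicted_class) pairs for confident predictions.

    `probabilities` is a list of per-class probability lists, one per
    unlabeled example. The threshold value is an illustrative assumption.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        confidence = max(probs)
        if confidence >= threshold:
            # Use the most probable class as the pseudo-label.
            selected.append((index, probs.index(confidence)))
    return selected


# Hypothetical classifier outputs for three unlabeled examples.
unlabeled_probs = [
    [0.95, 0.03, 0.02],  # confident: class 0
    [0.40, 0.35, 0.25],  # not confident: skipped
    [0.05, 0.05, 0.90],  # confident: class 2
]
print(pseudo_labels(unlabeled_probs))  # [(0, 0), (2, 2)]
```

In a training loop, the selected pairs would typically be treated as extra labeled examples, with the selection repeated as the classifier improves.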