Title: A Generalized Neyman-Pearson Criterion for Optimal Domain Adaptation
In the problem of domain adaptation for binary classification, the learner is presented with labeled examples from a source domain, and must correctly classify unlabeled examples from a target domain, which may differ from the source. Previous work on this problem has assumed that the performance measure of interest is the expected value of some loss function. We study a Neyman-Pearson-like criterion and argue that, for this optimality criterion, stronger domain adaptation results are possible than what has previously been established. In particular, we study a class of domain adaptation problems that generalizes both the covariate shift assumption and a model for feature-dependent label noise, and establish optimal classification on the target domain despite not having access to labeled data from this domain.
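For readers unfamiliar with the classical Neyman-Pearson criterion that the abstract builds on, the following is a minimal sketch, not the paper's generalized criterion or its domain adaptation procedure. It assumes a scalar score per example (higher means "more positive") and picks the smallest threshold whose empirical false-positive rate stays at or below a level `alpha`; the function names `np_threshold` and `error_rates` are hypothetical and chosen for illustration only.

```python
import numpy as np


def np_threshold(neg_scores, alpha):
    """Return the smallest threshold t such that the empirical
    false-positive rate, mean(neg_scores > t), is at most alpha."""
    s = np.sort(np.asarray(neg_scores, dtype=float))
    n = s.size
    # At most floor(alpha * n) negatives may score strictly above t.
    k = int(np.floor(alpha * n))
    if k >= n:
        # alpha >= 1: every negative may be misclassified.
        return -np.inf
    return s[n - k - 1]


def error_rates(neg_scores, pos_scores, t):
    """Empirical false-positive and false-negative rates of the rule
    'predict positive when score > t'."""
    fpr = float(np.mean(np.asarray(neg_scores) > t))
    fnr = float(np.mean(np.asarray(pos_scores) <= t))
    return fpr, fnr


# Illustrative scores (made up for this sketch).
neg = np.arange(10)             # scores of negative-class examples
pos = np.array([5, 8, 9, 10])   # scores of positive-class examples
t = np_threshold(neg, alpha=0.2)
fpr, fnr = error_rates(neg, pos, t)
```

Among thresholds on a fixed score, the smallest one that satisfies the false-positive constraint also gives the lowest false-negative rate, which is the trade-off the Neyman-Pearson criterion formalizes.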
Award ID(s):
1838179
NSF-PAR ID:
10129933
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 30th International Conference on Algorithmic Learning Theory, vol. 98 of Proceedings of Machine Learning Research
Page Range / eLocation ID:
738-761
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The assumption that training and testing samples are generated from the same distribution does not always hold for real-world machine-learning applications. The procedure of tackling this discrepancy between the training (source) and testing (target) domains is known as domain adaptation. We propose an unsupervised version of domain adaptation that considers the presence of only unlabelled data in the target domain. Our approach centres on finding correspondences between samples of each domain. The correspondences are obtained by treating the source and target samples as graphs and using a convex criterion to match them. The criteria used are first-order and second-order similarities between the graphs as well as a class-based regularization. We have also developed a computationally efficient routine for the convex optimization, thus allowing the proposed method to be used widely. To verify the effectiveness of the proposed method, computer simulations were conducted on synthetic, image classification and sentiment classification datasets. Results validated that the proposed local sample-to-sample matching method outperforms traditional moment-matching methods and is competitive with respect to current local domain-adaptation methods.
  2. We address the problem of human action classification in drone videos. Due to the high cost of capturing and labeling large-scale drone videos with diverse actions, we present unsupervised and semi-supervised domain adaptation approaches that leverage both the existing fully annotated action recognition datasets and unannotated (or only a few annotated) videos from drones. To study the emerging problem of drone-based action recognition, we create a new dataset, NEC-DRONE, containing 5,250 videos to evaluate the task. We tackle both problem settings with 1) same and 2) different action label sets for the source (e.g., Kinetics dataset) and target domains (drone videos). We present a combination of video and instance-based adaptation methods, paired with either a classifier or an embedding-based framework to transfer the knowledge from source to target. Our results show that the proposed adaptation approach substantially improves the performance on these challenging and practical tasks. We further demonstrate the applicability of our method for learning cross-view action recognition on the Charades-Ego dataset. We provide qualitative analysis to understand the behaviors of our approaches.
  3. Elofsson, Arne (Ed.)
    Motivation: Cryoelectron tomography (cryo-ET) visualizes structure and spatial organization of macromolecules and their interactions with other subcellular components inside single cells in the close-to-native state at submolecular resolution. Such information is critical for the accurate understanding of cellular processes. However, subtomogram classification remains one of the major challenges for the systematic recognition and recovery of the macromolecule structures in cryo-ET because of imaging limits and data quantity. Recently, deep learning has significantly improved the throughput and accuracy of large-scale subtomogram classification. However, it is often difficult to get enough high-quality annotated subtomogram data for supervised training due to the enormous expense of labeling. To tackle this problem, it is beneficial to utilize another already annotated dataset to assist the training process. However, due to the discrepancy of image intensity distribution between source domain and target domain, the model trained on subtomograms in source domain may perform poorly in predicting subtomogram classes in the target domain. Results: In this article, we adapt a few-shot domain adaptation method for deep learning-based cross-domain subtomogram classification. The essential idea of our method consists of two parts: (i) take full advantage of the distribution of plentiful unlabeled target domain data, and (ii) exploit the correlation between the whole source domain dataset and few labeled target domain data. Experiments conducted on simulated and real datasets show that our method achieves significant improvement on cross-domain subtomogram classification compared with baseline methods. Availability and implementation: Software is available online at https://github.com/xulabs/aitom. Supplementary information: Supplementary data are available at Bioinformatics online.
  4. Given its demonstrated ability in analyzing and revealing patterns underlying data, Deep Learning (DL) has been increasingly investigated to complement physics-based models in various aspects of smart manufacturing, such as machine condition monitoring and fault diagnosis, complex manufacturing process modeling, and quality inspection. However, successful implementation of DL techniques relies greatly on the amount, variety, and veracity of data for robust network training. Also, the distributions of data used for network training and application should be identical to avoid the internal covariate shift problem, which degrades network performance and limits its applicability. As a promising solution to address these challenges, Transfer Learning (TL) enables DL networks trained on a source domain and task to be applied to a separate target domain and task. This paper presents a domain adversarial TL approach, based upon the concepts of generative adversarial networks. In this method, the optimizer seeks to minimize the task loss (i.e., regression or classification error) across the labeled training examples from the source domain while maximizing the loss of the domain classifier across the source and target data sets (i.e., maximizing the similarity of source and target features). The developed domain adversarial TL method has been implemented on a 1-D CNN backbone network and evaluated for prediction of tool wear propagation, using NASA's milling dataset. Performance has been compared to other TL techniques, and the results indicate that domain adversarial TL can successfully allow DL models trained on certain scenarios to be applied to new target tasks.
  5. In this paper we propose a data-driven fault detection framework for semi-supervised scenarios where labeled training data from the system under consideration (the “target”) is imbalanced (e.g., only relatively few labels are available from one of the classes), but data from a related system (the “source”) is readily available. An example of this situation is when a generic simulator is available, but needs to be tuned on a case-by-case basis to match the parameters of the actual system. The goal of this paper is to work with the statistical distribution of the data without necessitating system identification. Our main result shows that if the source and target domain are related by a linear transformation (a common assumption in domain adaptation), the problem of designing a classifier that minimizes a misclassification loss over the joint source and target domains reduces to a convex optimization subject to a single (non-convex) equality constraint. This second-order equality constraint can be recast as a rank-1 optimization problem, where the rank constraint can be efficiently handled through a reweighted nuclear norm surrogate. These results are illustrated with a practical application: fault detection in additive manufacturing (industrial 3D printing). The proposed method is able to exploit simulation data (source domain) to substantially outperform classifiers tuned using only data from a single domain.