skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Domain Adaptation Methods for Lab-to-Field Human Context Recognition
Human context recognition (HCR) using sensor data is a crucial task in Context-Aware (CA) applications in domains such as healthcare and security. Supervised machine learning HCR models are trained using smartphone HCR datasets that are scripted or gathered in-the-wild. Scripted datasets are most accurate because of their consistent visit patterns. Supervised machine learning HCR models perform well on scripted datasets but poorly on realistic data. In-the-wild datasets are more realistic, but cause HCR models to perform worse due to data imbalance, missing or incorrect labels, and a wide variety of phone placements and device types. Lab-to-field approaches learn a robust data representation from a scripted, high-fidelity dataset, which is then used for enhancing performance on a noisy, in-the-wild dataset with similar labels. This research introduces Triplet-based Domain Adaptation for Context REcognition (Triple-DARE), a lab-to-field neural network method that combines three unique loss functions to enhance intra-class compactness and inter-class separation within the embedding space of multi-labeled datasets: (1) domain alignment loss in order to learn domain-invariant embeddings; (2) classification loss to preserve task-discriminative features; and (3) joint fusion triplet loss. Rigorous evaluations showed that Triple-DARE achieved 6.3% and 4.5% higher F1-score and classification, respectively, than state-of-the-art HCR baselines and outperformed non-adaptive HCR models by 44.6% and 10.7%, respectively.  more » « less
Award ID(s):
2021871
PAR ID:
10432335
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Sensors
Volume:
23
Issue:
6
ISSN:
1424-8220
Page Range / eLocation ID:
3081
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Rubin, Stuart; Chen, Shu-Ching (Ed.)
    In this work, we use an unsupervised method for generating binary class labels in a novel context to create class labels for Medicare fraud detection. We examine how class imbalance influences the quality of these new labels and how it affects supervised classification. We use four different Medicare Part D fraud detection datasets, with the largest containing over 5 million instances. The other three datasets are sampled from the original dataset. Using Random Under-Sampling (RUS), we subsample from the majority class of the original data to produce three datasets with varying levels of class imbalance. To evaluate the performance of the newly created labels, we train a supervised classifier and evaluate its classification performance and compare it to an unsupervised anomaly detection method as a baseline. Our empirical findings indicate that the generated class labels are of high enough quality and enable effective supervised classifier training for fraud detection. Additionally, supervised classification with the new labels consistently outperforms the baseline used for comparison across all test scenarios. Further more, we observe an inverse relationship between class imbalance in the dataset and classifier performance, with AUPRC scores improving as the training dataset becomes more balanced. This work not only validates the efficacy of the synthesized class labels in labeling Medicare fraud but also shows its robustness across different degrees of class imbalance. 
    more » « less
  2. Shekhar, Shashi; Zhou, Zhi-Hua; Chiang, Yao-Yi; Stiglic, Gregor (Ed.)
    Creating separable representations via representation learning and clustering is critical in analyzing large unstructured datasets with only a few labels. Separable representations can lead to supervised models with better classification capabilities and additionally aid in generating new labeled samples. Most unsupervised and semisupervised methods to analyze large datasets do not leverage the existing small amounts of labels to get better representations. In this paper, we propose a spatiotemporal clustering paradigm that uses spatial and temporal features combined with a constrained loss to produce separable representations. We show the working of this method on the newly published dataset ReaLSAT, a dataset of surface water dynamics for over 680,000 lakes across the world, making it an essential dataset in terms of ecology and sustainability. Using this large unlabelled dataset, we first show how a spatiotemporal representation is better compared to just spatial or temporal representation. We then show how we can learn even better representations using a constrained loss with few labels. We conclude by showing how our method, using few labels, can pick out new labeled samples from the unlabeled data, which can be used to augment supervised methods leading to better classification. 
    more » « less
  3. Abstract Text classification is a widely studied problem and has broad applications. In many real-world problems, the number of texts for training classification models is limited, which renders these models prone to overfitting. To address this problem, we propose SSL-Reg, a data-dependent regularization approach based on self-supervised learning (SSL). SSL (Devlin et al., 2019a) is an unsupervised learning approach that defines auxiliary tasks on input data without using any human-provided labels and learns data representations by solving these auxiliary tasks. In SSL-Reg, a supervised classification task and an unsupervised SSL task are performed simultaneously. The SSL task is unsupervised, which is defined purely on input texts without using any human- provided labels. Training a model using an SSL task can prevent the model from being overfitted to a limited number of class labels in the classification task. Experiments on 17 text classification datasets demonstrate the effectiveness of our proposed method. Code is available at https://github.com/UCSD-AI4H/SSReg. 
    more » « less
  4. Speech emotion recognition (SER) is a challenging task due to the limited availability of real-world labeled datasets. Since it is easier to find unlabeled data, the use of self-supervised learning (SSL) has become an attractive alternative. This study proposes new pre-text tasks for SSL to improve SER. While our target application is SER, the proposed pre-text tasks include audio-visual formulations, leveraging the relationship between acoustic and facial features. Our proposed approach introduces three new unimodal and multimodal pre-text tasks that are carefully designed to learn better representations for predicting emotional cues from speech. Task 1 predicts energy variations (high or low) from a speech sequence. Task 2 uses speech features to predict facial activation (high or low) based on facial landmark movements. Task 3 performs a multi-class emotion recognition task on emotional labels obtained from combinations of action units (AUs) detected across a video sequence. We pre-train a network with 60.92 hours of unlabeled data, fine-tuning the model for the downstream SER task. The results on the CREMA-D dataset show that the model pre-trained on the proposed domain-specific pre-text tasks significantly improves the precision (up to 5.1%), recall (up to 4.5%), and F1-scores (up to 4.9%) of our SER system. 
    more » « less
  5. Learning from label proportions (LLP) is a weakly supervised classification problem where data points are grouped into bags, and the label proportions within each bag are observed instead of the instance-level labels. The task is to learn a classifier to predict the labels of future individual instances. Prior work on LLP for multi-class data has yet to develop a theoretically grounded algorithm. In this work, we propose an approach to LLP based on a reduction to learning with label noise, using the forward correction (FC) loss of Patrini et al. [30]. We establish an excess risk bound and generalization error analysis for our approach, while also extending the theory of the FC loss which may be of independent interest. Our approach demonstrates improved empirical performance in deep learning scenarios across multiple datasets and architectures, compared to the leading methods. 
    more » « less