Title: Model-Agnostic Structural Transfer Learning for Cross-Domain Autonomous Activity Recognition
Activity recognition using data collected with smart devices such as mobile and wearable sensors has become a critical component of many emerging applications ranging from behavioral medicine to gaming. However, an unprecedented increase in the diversity of smart devices in the internet-of-things era has limited the adoption of activity recognition models for use across different devices. This lack of cross-domain adaptation is particularly notable across sensors of different modalities where the mapping of the sensor data in the traditional feature level is highly challenging. To address this challenge, we propose ActiLabel, a combinatorial framework that learns structural similarities among the events that occur in a target domain and those of a source domain and identifies an optimal mapping between the two domains at their structural level. The structural similarities are captured through a graph model, referred to as the dependency graph, which abstracts details of activity patterns in low-level signal and feature space. The activity labels are then autonomously learned in the target domain by finding an optimal tiered mapping between the dependency graphs. We carry out an extensive set of experiments on three large datasets collected with wearable sensors involving human subjects. The results demonstrate the superiority of ActiLabel over state-of-the-art transfer learning and deep learning methods. In particular, ActiLabel outperforms such algorithms by average F1-scores of 36.3%, 32.7%, and 9.1% for cross-modality, cross-location, and cross-subject activity recognition, respectively.
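The idea of mapping two domains at a structural level rather than a feature level can be illustrated with a minimal sketch. It is not ActiLabel's published algorithm: the abstract does not specify how the dependency graph is built or how the tiered mapping is computed, so the centroid-distance graph, the per-node signature, and the exhaustive one-to-one search below are all assumptions chosen for brevity, and every function name is hypothetical.

```python
from itertools import permutations
from math import dist


def dependency_graph(centroids):
    """Assumed graph model: pairwise centroid distances scaled to [0, 1]."""
    n = len(centroids)
    w = [[dist(centroids[i], centroids[j]) for j in range(n)] for i in range(n)]
    peak = max(max(row) for row in w) or 1.0  # avoid dividing by zero
    return [[v / peak for v in row] for row in w]


def node_signature(graph, i):
    """Sorted edge weights from node i to all other nodes (order-free)."""
    return sorted(v for j, v in enumerate(graph[i]) if j != i)


def map_clusters(source_graph, target_graph):
    """Return mapping[t] = source node matched to target node t.

    Exhaustive search over one-to-one mappings; only practical for a
    handful of clusters. Both graphs must have the same number of nodes.
    """
    n = len(source_graph)
    src = [node_signature(source_graph, i) for i in range(n)]
    tgt = [node_signature(target_graph, i) for i in range(n)]
    # cost[t][s]: how differently target node t and source node s sit
    # inside their respective graphs.
    cost = [[sum(abs(a - b) for a, b in zip(tgt[t], src[s])) for s in range(n)]
            for t in range(n)]
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[t][p[t]] for t in range(n)))
    return list(best)


# Hypothetical data: labeled source clusters in a 2-D feature space and
# unlabeled target clusters from a different modality (1-D, other scale).
source_labels = ["sit", "walk", "run"]
source_centroids = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
target_centroids = [(45.0,), (5.0,), (15.0,)]

mapping = map_clusters(dependency_graph(source_centroids),
                       dependency_graph(target_centroids))
target_labels = [source_labels[s] for s in mapping]
print(target_labels)
```

Because only normalized distances between clusters are compared, the two domains never need a shared feature space, which is the property the abstract attributes to structure-level mapping.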
Award ID(s):
2210133
NSF-PAR ID:
10527682
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Sensors
Volume:
23
Issue:
14
ISSN:
1424-8220
Page Range / eLocation ID:
6337
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Human activity recognition (HAR) from wearable sensor data has become ubiquitous due to the widespread proliferation of IoT and wearable devices. However, recognizing human activity in heterogeneous environments, for example, with sensors of different makes and models, across different persons, and across on-body sensor placements, introduces wide-ranging discrepancies in the data distributions and therefore leads to an increased error margin. Transductive transfer learning techniques such as domain adaptation have been quite successful in mitigating the domain discrepancies between the source and target domain distributions without the costly target domain data annotations. However, little exploration has been done when multiple distinct source domains are present, and the optimum mapping to the target domain from each source is not apparent. In this paper, we propose a deep Multi-Source Adversarial Domain Adaptation (MSADA) framework that opportunistically helps select the most relevant feature representations from multiple source domains and establishes such mappings to the target domain by learning the perplexity scores. We show that the learned mappings reflect our prior knowledge of the semantic relationships between the domains, indicating that MSADA can be employed as a powerful tool for exploratory activity data analysis. We empirically demonstrate that our proposed multi-source domain adaptation approach achieves a 2% improvement on the OPPORTUNITY dataset (cross-person heterogeneity, 4 ADLs) and a 13% improvement on the DSADS dataset (cross-position heterogeneity, 10 ADLs and sports activities).
  2. Recent years have witnessed a growing body of research on autonomous activity recognition models for use in deployment of mobile systems in new settings such as when a wearable system is adopted by a new user. Current research, however, lacks comprehensive frameworks for transfer learning. Specifically, it lacks the ability to deal with partially available data in new settings. To address these limitations, we propose OptiMapper, a novel uninformed cross-subject transfer learning framework for activity recognition. OptiMapper is a combinatorial optimization framework that extracts abstract knowledge across subjects and utilizes this knowledge for developing a personalized and accurate activity recognition model in new subjects. To this end, a novel community-detection-based clustering of unlabeled data is proposed that uses the target user data to construct a network of unannotated sensor observations. The clusters of these target observations are then mapped onto the source clusters using a complete bipartite graph model. In the next step, the mapped labels are conditionally fused with the prediction of a base learner to create a personalized and labeled training dataset for the target user. We present two instantiations of OptiMapper. The first instantiation, which is applicable for transfer learning across domains with identical activity labels, performs a one-to-one bipartite mapping between clusters of the source and target users. The second instantiation performs optimal many-to-one mapping between the source clusters and those of the target. The many-to-one mapping allows us to find an optimal mapping even when the target dataset does not contain sufficient instances of all activity classes. We show that this type of cross-domain mapping can be formulated as a transportation problem and solved optimally. We evaluate our transfer learning techniques on several activity recognition datasets. Our results show that the proposed community detection approach can achieve, on average, 69% utilization of the datasets for clustering with an overall clustering accuracy of 87.5%. Our results also suggest that the proposed transfer learning algorithms can achieve up to 22.5% improvement in the activity recognition accuracy, compared to the state-of-the-art techniques. The experimental results also demonstrate high and sustained performance even in the presence of partial data.
  3. Current research in the recognition of American Sign Language (ASL) has focused on perception using video or wearable gloves. However, deaf ASL users have expressed concern about the invasion of privacy with video, as well as the interference with daily activity and restrictions on movement presented by wearable gloves. In contrast, RF sensors can mitigate these issues because they are non-contact ambient sensors that are effective in the dark and can penetrate clothes, while only recording speed and distance. Thus, this paper investigates RF sensing as an alternative sensing modality for ASL recognition to facilitate interactive devices and smart environments for the deaf and hard-of-hearing. In particular, the recognition of up to 20 ASL signs, sequential classification of signing mixed with daily activity, and detection of a trigger sign to initiate human-computer interaction (HCI) via RF sensors are presented. Results yield 91.3% ASL word-level classification accuracy, 92.3% sequential recognition accuracy, and a 0.93 trigger recognition rate.
  4. We explore the effect of auxiliary labels in improving the classification accuracy of wearable sensor-based human activity recognition (HAR) systems, which are primarily trained with the supervision of the activity labels (e.g., running, walking, jumping). Supplemental meta-data are often available during the data collection process, such as body positions of the wearable sensors, subjects' demographic information (e.g., gender, age), and the type of wearable used (e.g., smartphone, smart-watch). This information, while not directly related to the activity classification task, can nonetheless provide auxiliary supervision and has the potential to significantly improve the HAR accuracy by providing extra guidance on how to handle the sample heterogeneity introduced by the change in domains (i.e., positions, persons, or sensors), especially in the presence of limited activity labels. However, integrating such meta-data information in the classification pipeline is non-trivial: (i) the complex interaction between the activity and domain label space is hard to capture with a simple multi-task and/or adversarial learning setup, and (ii) meta-data and activity labels might not be simultaneously available for all collected samples. To address these issues, we propose a novel framework, Conditional Domain Embeddings (CoDEm). From the available unlabeled raw samples and their domain meta-data, we first learn a set of domain embeddings using a contrastive learning methodology to handle inter-domain variability and inter-domain similarity. To classify the activities, CoDEm then learns the label embeddings in a contrastive fashion, conditioned on domain embeddings with a novel attention mechanism, forcing the model to learn the complex domain-activity relationships. We extensively evaluate CoDEm on three benchmark datasets against a number of multi-task and adversarial learning baselines and achieve state-of-the-art performance on each.
  5. Human activity recognition (HAR) from wearable sensor data has recently gained widespread adoption in a number of fields. However, recognizing complex human activities and postural and rhythmic body movements (e.g., dance, sports) is challenging due to the lack of domain-specific labeling information and the perpetual variability in human movement kinematics profiles arising from age, sex, dexterity, and level of professional training. In this paper, we propose a deep activity recognition model to work with limited labeled data, both for simple and complex human activities. To mitigate the intra- and inter-user spatio-temporal variability of movements, we posit novel data augmentation and domain normalization techniques. We depict a semi-supervised technique that learns noise and transformation invariant feature representation from sparsely labeled data to accommodate intra-personal and inter-user variations of human movement kinematics. We also postulate a transfer learning approach to learn domain invariant feature representations by minimizing the feature distribution distance between the source and target domains. We showcase the improved performance of our proposed framework, AugToAct, using a public HAR dataset. We also design our own data collection, annotation and experimental setup on complex dance activity recognition steps and kinematics movements where we achieved higher performance metrics with limited labeled data compared to simple activity recognition tasks.
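The many-to-one mapping that item 2 (OptiMapper) casts as a transportation problem can be sketched on a toy instance. This is only an illustration under stated assumptions, not the authors' formulation: it assumes each target cluster supplies one unit that must go to exactly one source cluster, that each source cluster accepts at most `capacity` target clusters, and that a dissimilarity matrix `cost` is already available. Exhaustive enumeration stands in for a proper transportation or linear-programming solver, so it suits only very small inputs. The function name and the numbers are hypothetical.

```python
from collections import Counter
from itertools import product


def many_to_one_mapping(cost, capacity):
    """Assign every target cluster to one source cluster at minimum cost.

    cost[t][s] is the assumed dissimilarity between target cluster t and
    source cluster s. A source cluster may be reused by several target
    clusters, but by no more than `capacity` of them.
    Returns (plan, total_cost), or (None, inf) if no plan is feasible.
    """
    n_sources = len(cost[0])
    best_plan, best_cost = None, float("inf")
    for plan in product(range(n_sources), repeat=len(cost)):
        # Enforce the per-source capacity (the demand-side constraint).
        if max(Counter(plan).values()) > capacity:
            continue
        total = sum(cost[t][s] for t, s in enumerate(plan))
        if total < best_cost:
            best_plan, best_cost = plan, total
    return best_plan, best_cost


# Hypothetical dissimilarities: 4 target clusters, 3 labeled source clusters.
cost = [
    [0.1, 0.9, 0.8],
    [0.2, 0.7, 0.9],
    [0.3, 0.4, 0.9],
    [0.9, 0.8, 0.2],
]
plan, total = many_to_one_mapping(cost, capacity=2)
print(plan, round(total, 3))
```

In this instance three target clusters individually prefer source cluster 0, so the capacity constraint is what forces a globally optimal plan instead of a per-cluster greedy choice. With unit supplies and integer capacities, a transportation solver would return the same kind of integral plan without enumerating every combination.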