Title: Active Deep Learning for Activity Recognition with Context Aware Annotator Selection
Machine learning models are bounded by the credibility of the ground truth data used for both training and testing. Regardless of the problem domain, ground truth annotation is manual and tedious, as it requires a considerable amount of human intervention. With the advent of active learning with multiple annotators, this burden can be somewhat mitigated by actively acquiring labels for the most informative data instances. However, multiple annotators with varying degrees of expertise pose a new set of challenges in terms of the quality of the labels received and the availability of the annotators. Due to the limited amount of ground truth information addressing the variability of Activities of Daily Living (ADLs), activity recognition models using wearable and mobile devices are still not robust enough for real-world deployment. In this paper, we propose an active learning combined deep model which updates its network parameters based on the optimization of a joint loss function. We then propose a novel annotator selection model that exploits the relationships among the users while considering their heterogeneity with respect to their expertise and their physical and spatial context. Our proposed model leverages model-free deep reinforcement learning in a partially observable environment setting to capture the action-reward interaction among multiple annotators. Our experiments in real-world settings show that our active deep model converges to optimal accuracy with fewer labeled instances and achieves an 8% improvement in accuracy in fewer iterations.
Award ID(s):
1750936
NSF-PAR ID:
10144109
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Anchorage, Alaska, August 2019
Page Range / eLocation ID:
1862 to 1870
Sponsoring Org:
National Science Foundation
More Like this
  1. Crowdsourcing provides a practical way to obtain large amounts of labeled data at a low cost. However, annotation quality varies considerably across annotators, which imposes new challenges in learning a high-quality model from crowdsourced annotations. In this work, we provide a new perspective that decomposes annotation noise into common noise and individual noise and differentiates the source of confusion based on instance difficulty and annotator expertise on a per-instance-annotator basis. We realize this new crowdsourcing model with an end-to-end learning solution that has two types of noise adaptation layers: one is shared across annotators to capture their commonly shared confusions, and the other pertains to each annotator to capture individual confusion. To recognize the source of noise in each annotation, we use an auxiliary network to choose between the two noise adaptation layers with respect to both instances and annotators. Extensive experiments on both synthesized and real-world benchmarks demonstrate the effectiveness of our proposed common noise adaptation solution.
  2. Using noisy crowdsourced labels from multiple annotators, a deep learning-based end-to-end (E2E) system aims to learn the label correction mechanism and the neural classifier simultaneously. To this end, many E2E systems concatenate the neural classifier with multiple annotator-specific label confusion layers and co-train the two parts in a parameter-coupled manner. The formulated coupled cross-entropy minimization (CCEM)-type criteria are intuitive and work well in practice. Nonetheless, theoretical understanding of the CCEM criterion has been limited. The contribution of this work is twofold: First, performance guarantees of the CCEM criterion are presented. Our analysis reveals for the first time that the CCEM can indeed correctly identify the annotators' confusion characteristics and the desired "ground-truth" neural classifier under realistic conditions, e.g., when only incomplete annotator labeling and finite samples are available. Second, based on the insights learned from our analysis, two regularized variants of the CCEM are proposed. The regularization terms provably enhance the identifiability of the target model parameters in various more challenging cases. A series of synthetic and real data experiments are presented to showcase the effectiveness of our approach.
  3. Precise and eloquent label information is fundamental for interpreting the underlying data distributions distinctly and for adequately training supervised and semi-supervised learning models. But obtaining a large amount of labeled data demands substantial manual effort. This burden can be mitigated by acquiring labels for the most informative data instances using Active Learning. However, labels received from humans are not always reliable and pose the risk of introducing noisy class labels, which degrade the efficacy of a model instead of improving it. In this paper, we address the problem of annotating sensor data instances of various Activities of Daily Living (ADLs) in a smart home context. We exploit the interactions between the users and annotators in terms of relationships that span space and time and that also account for an activity. We propose a novel annotator selection model, SocialAnnotator, which exploits the interactions between the users and annotators and ranks the annotators based on their level of correspondence. We also introduce a novel approach to measure this correspondence distance using the spatial and temporal information of interactions, the type of the relationships, and activities. We validate our proposed SocialAnnotator framework in smart environments, achieving ≈ 84% statistical confidence in data annotation.
  4. In this paper, we present OpenWaters, a real-time open-source underwater simulation kit for generating photorealistic underwater scenes. OpenWaters supports the creation of a massive number of underwater images by emulating diverse real-world conditions. It allows fine control over every variable in a simulation instance, including geometry, rendering parameters like ray-traced water caustics, scattering, and ground-truth labels. Using underwater depth (distance between camera and object) estimation as the use case, we showcase and validate the capabilities of OpenWaters to model underwater scenes that are used to train a deep neural network for depth estimation. Our experimental evaluation demonstrates high-accuracy depth estimation using synthetic underwater images, and the feasibility of transfer learning of features from synthetic to real-world images.
  5. Activity Recognition (AR) models perform well with a large number of available training instances. However, sensor heterogeneity, sensing bias, variability of human behaviors and activities, and unseen activity classes pose key challenges to adopting and scaling these pre-trained activity recognition models in a new environment. These challenging unseen activity recognition problems are addressed by applying transfer learning techniques that leverage a limited number of annotated samples and utilize the inherent structural patterns among activities within and across the source and target domains. This work proposes a novel AR framework that uses a pre-trained deep autoencoder model and generates features from source and target activity samples. Furthermore, this AR framework establishes correlations among activities between the source and target domains by exploiting intra- and inter-class knowledge transfer to reduce the number of labeled samples needed and to recognize unseen activities in the target domain. We validated the efficacy and effectiveness of our AR framework with three real-world data traces (Daily and Sports, Opportunistic, and Wisdm) that contain 41 users and 26 activities in total. Our AR framework achieves performance gains of ≈ 5-6% with 111, 18, and 70 activity samples (20% annotated samples) for the Das, Opp, and Wisdm datasets. In addition, our proposed AR framework requires 56, 8, and 35 fewer activity samples (10% fewer annotated examples) for Das, Opp, and Wisdm, respectively, compared to the state-of-the-art Untran model.