Title: SyncWISE: Window Induced Shift Estimation for Synchronization of Video and Accelerometry from Wearable Sensors
The development and validation of computational models to detect daily human behaviors (e.g., eating, smoking, brushing) using wearable devices requires labeled data collected from the natural field environment, with tight time synchronization of the micro-behaviors (e.g., start/end times of hand-to-mouth gestures during a smoking puff or an eating gesture) and the associated labels. Video data is increasingly being used for such label collection. Unfortunately, wearable devices and video cameras with independent (and drifting) clocks make tight time synchronization challenging. To address this issue, we present the Window Induced Shift Estimation method for Synchronization (SyncWISE) approach. We demonstrate the feasibility and effectiveness of our method by synchronizing the timestamps of a wearable camera and wearable accelerometer from 163 videos representing 45.2 hours of data from 21 participants enrolled in a real-world smoking cessation study. Our approach shows significant improvement over the state-of-the-art, even in the presence of high data loss, achieving 90% synchronization accuracy given a synchronization tolerance of 700 milliseconds. Our method also achieves state-of-the-art synchronization performance on the CMU-MMAC dataset.
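As a rough illustration of the "synchronization accuracy given a synchronization tolerance" figure quoted in the abstract, the sketch below computes the fraction of videos whose estimated time offset falls within a tolerance of a reference offset. The function name, the per-video offset lists, and the inclusive comparison are assumptions made for illustration; the abstract does not spell out the exact definition used in the paper.

```python
def sync_accuracy(estimated_offsets_ms, reference_offsets_ms, tolerance_ms=700.0):
    """Fraction of videos whose offset error is within tolerance_ms.

    Hypothetical helper: both inputs hold one offset per video, in milliseconds.
    The inclusive bound (<=) is an assumption, not taken from the paper.
    """
    if len(estimated_offsets_ms) != len(reference_offsets_ms):
        raise ValueError("offset lists must have the same length")
    if not estimated_offsets_ms:
        raise ValueError("at least one video is required")
    hits = sum(
        1
        for est, ref in zip(estimated_offsets_ms, reference_offsets_ms)
        if abs(est - ref) <= tolerance_ms
    )
    return hits / len(estimated_offsets_ms)


# Illustrative, made-up offsets for four videos (not data from the study).
estimated = [120.0, -650.0, 900.0, 30.0]
reference = [0.0, 0.0, 0.0, 0.0]
accuracy = sync_accuracy(estimated, reference)  # errors: 120, 650, 900, 30 ms
```

With these made-up values, three of the four errors are within 700 ms, so the computed accuracy is 0.75.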
Award ID(s):
1823201 1915847
NSF-PAR ID:
10274462
Journal Name:
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume:
4
Issue:
3
ISSN:
2474-9567
Page Range / eLocation ID:
1 to 26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance, etc. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors amongst all users, disregarding the fact that individuals can behave very differently, resulting in reduced model performance. Further, black-box models are often used that do not allow for interpretability and human behavior understanding. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majority voting to obtain the final prediction. We apply our algorithm to a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics.
Beyond achieving better classification performance, our novel approach is further able to generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future.
  2. Abstract

    Dietary intake, eating behaviors, and context are important in chronic disease development, yet our ability to accurately assess these in research settings can be limited by biased traditional self-reporting tools. Objective measurement tools, specifically, wearable sensors, present the opportunity to minimize the major limitations of self-reported eating measures by generating supplementary sensor data that can improve the validity of self-report data in naturalistic settings. This scoping review summarizes the current use of wearable devices/sensors that automatically detect eating-related activity in naturalistic research settings. Five databases were searched in December 2019, and 618 records were retrieved from the literature search. This scoping review included N = 40 studies (from 33 articles) that reported on one or more wearable sensors used to automatically detect eating activity in the field. The majority of studies (N = 26, 65%) used multi-sensor systems (incorporating > 1 wearable sensors), and accelerometers were the most commonly utilized sensor (N = 25, 62.5%). All studies (N = 40, 100.0%) used either self-report or objective ground-truth methods to validate the inferred eating activity detected by the sensor(s). The most frequently reported evaluation metrics were Accuracy (N = 12) and F1-score (N = 10). This scoping review highlights the current state of wearable sensors’ ability to improve upon traditional eating assessment methods by passively detecting eating activity in naturalistic settings, over long periods of time, and with minimal user interaction. A key challenge in this field, wide variation in eating outcome measures and evaluation metrics, demonstrates the need for the development of a standardized form of comparability among sensors/multi-sensor systems and multidisciplinary collaboration.
  3. With rapid growth in unhealthy diet behaviors, implementing strategies that improve healthy eating is becoming increasingly important. One approach to improving diet behavior is to continuously monitor dietary intake (e.g., calorie intake) and provide educational, motivational, and dietary recommendation feedback. Although technologies based on wearable sensors, mobile applications, and light-weight cameras exist to gather diet-related information such as food type and eating time, there remains a gap in research on how to use such information to close the loop and provide feedback to the user to improve healthy diet. We address this knowledge gap by introducing a diet behavior change framework that generates real-time diet recommendations based on a user’s food intake while considering the user’s deviation from the suggested diet routine. We formulate the problem of optimal diet recommendation as a sequential decision-making problem and design a greedy algorithm that provides diet recommendations such that the amount of change in the user’s dietary habits is minimized while ensuring that the user’s diet goal is achieved within a given time frame. This novel approach is inspired by the Social Cognitive Theory, which emphasizes behavioral monitoring and small incremental goals as being important to behavior change. Our optimization algorithm integrates data from a user’s past dietary intake as well as the USDA nutrition dataset to identify optimal diet changes. We demonstrate the feasibility of our optimization algorithms for diet behavior change using real data collected in two study cohorts with a combined N=10 healthy participants who recorded their diet for up to 21 days.
  4. Computer vision on low-power edge devices enables applications including search-and-rescue and security. State-of-the-art computer vision algorithms, such as Deep Neural Networks (DNNs), are too large for inference on low-power edge devices. To improve efficiency, some existing approaches parallelize DNN inference across multiple edge devices. However, these techniques introduce significant communication and synchronization overheads or are unable to balance workloads across devices. This paper demonstrates that the hierarchical DNN architecture is well suited for parallel processing on multiple edge devices. We design a novel method that creates a parallel inference pipeline for computer vision problems that use hierarchical DNNs. The method balances loads across the collaborating devices and reduces communication costs to facilitate the processing of multiple video frames simultaneously with higher throughput. Our experiments consider a representative computer vision problem where image recognition is performed on each video frame, running on multiple Raspberry Pi 4Bs. With four collaborating low-power edge devices, our approach achieves 3.21× higher throughput, 68% less energy consumption per device per frame, and a 58% decrease in memory when compared with existing single-device hierarchical DNNs.
  5. The increased ubiquitousness of small smart devices, such as cellphones, tablets, smart watches and laptops, has led to unique user data, which can be locally processed. The sensors (e.g., microphones and webcam) and improved hardware of the new devices have allowed running deep learning models that 20 years ago would have been exclusive to high-end expensive machines. In spite of this progress, state-of-the-art algorithms for facial expression recognition (FER) rely on architectures that cannot be implemented on these devices due to computational and memory constraints. Alternatives involving cloud-based solutions impose privacy barriers that prevent their adoption or user acceptance in a wide range of applications. This paper proposes a lightweight model that can run in real-time for image facial expression recognition (IFER) and video facial expression recognition (VFER). The approach relies on a personalization mechanism locally implemented for each subject by fine-tuning a central VFER model with unlabeled videos from a target subject. We train the IFER model to generate pseudo labels and we select the videos with the most confident predictions to be used for adaptation. The adaptation is performed by implementing a federated learning strategy where the weights of the local model are averaged and used by the central VFER model. We demonstrate that this approach can improve not only the performance on the edge device providing personalized models to the users, but also the central VFER model. We implement a federated learning strategy where the weights of the local models are averaged and used by the central VFER. Within-corpus and cross-corpus evaluations on two emotional databases demonstrate that edge models adapted with our personalization strategy achieve up to 13.1% gains in F1-scores. Furthermore, the federated learning implementation improves the mean micro F1-score of the central VFER model by up to 3.4%.
The proposed lightweight solution is ideal for interactive user interfaces that preserve the data of the users. 
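The weight-averaging step described in this abstract (local model weights averaged and used by the central model) can be sketched as a plain element-wise mean. This is a minimal sketch under stated assumptions, not the authors' implementation: the dictionary layout, the parameter names, and the unweighted mean are all hypothetical, and the abstract does not say whether the averaging is weighted by the amount of local data.

```python
def average_weights(local_models):
    """Element-wise mean of parameters across local models.

    Hypothetical layout: each model is a dict mapping a parameter name to a
    flat list of floats. An unweighted mean is assumed for illustration.
    """
    if not local_models:
        raise ValueError("at least one local model is required")
    names = set(local_models[0])
    if any(set(model) != names for model in local_models):
        raise ValueError("all local models must share the same parameter names")
    averaged = {}
    for name in local_models[0]:
        vectors = [model[name] for model in local_models]
        if any(len(vec) != len(vectors[0]) for vec in vectors):
            raise ValueError(f"parameter {name!r} has inconsistent sizes")
        # zip(*vectors) groups the i-th entry of every local model together.
        averaged[name] = [sum(vals) / len(vectors) for vals in zip(*vectors)]
    return averaged


# Two made-up local models with hypothetical parameter names.
local_a = {"fc.weight": [1.0, 2.0], "fc.bias": [0.0]}
local_b = {"fc.weight": [3.0, 4.0], "fc.bias": [1.0]}
central = average_weights([local_a, local_b])
```

For the two made-up models above, the averaged central parameters are `[2.0, 3.0]` for `fc.weight` and `[0.5]` for `fc.bias`.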