Title: HealthWalks: Sensing Fine-grained Individual Health Condition via Mobility Data
Can health conditions be inferred from an individual's mobility pattern? Existing research has discussed the relationship between individual physical activity/mobility and well-being, yet no systematic study has investigated the predictability of fine-grained health conditions from mobility, largely due to the unavailability of data and unsatisfactory modelling techniques. Here, we present a large-scale longitudinal study in which we collected the health conditions of 747 individuals who visited a hospital and tracked their mobility for 2 months in Beijing, China. To facilitate fine-grained individual health condition sensing, we propose HealthWalks, an interpretable machine learning model that takes user location traces, the associated points of interest, and user social demographics as input. At its core, a Deterministic Finite Automaton (DFA) model auto-generates explainable features that capture useful signals. We evaluate the effectiveness of our proposed model, which achieves 40.29% in Micro-F1 and 31.63% in Macro-F1 for the 8-class disease category prediction, and outperforms the best baseline by 22.84% in Micro-F1 and 31.79% in Macro-F1. In addition, deeper analysis based on SHapley Additive exPlanations (SHAP) shows that HealthWalks can derive meaningful insights into the correlation between mobility and health conditions, which provide important research insights and design implications for mobile sensing and health informatics.
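The abstract reports both Micro-F1 and Macro-F1, which can diverge sharply when classes are imbalanced. The following is a minimal sketch of how the two averages are commonly computed for single-label multi-class predictions. It is not the authors' evaluation code: the function name `f1_scores` and the toy labels are hypothetical, and averaging Macro-F1 over every label seen in either the true or the predicted labels is an assumption.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    # Assumption: macro average runs over every label seen in y_true or y_pred.
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p gains a false positive
            fn[t] += 1  # true class t gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: a classifier that always predicts the majority class.
y_true = ["a", "a", "a", "a", "b", "c"]
y_pred = ["a", "a", "a", "a", "a", "a"]
micro, macro = f1_scores(y_true, y_pred)
print(f"micro-F1={micro:.4f} macro-F1={macro:.4f}")
```

In this toy case the micro average stays fairly high because the majority class dominates the pooled counts, while the macro average drops because the two minority classes each score zero.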
Award ID(s):
1816889
NSF-PAR ID:
10282488
Journal Name:
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume:
4
Issue:
4
ISSN:
2474-9567
Page Range / eLocation ID:
1 to 26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. As countries look toward re-opening of economic activities amidst the ongoing COVID-19 pandemic, ensuring public health has been challenging. While contact tracing only aims to track past activities of infected users, one path to safe reopening is to develop reliable spatiotemporal risk scores to indicate the propensity of the disease. Existing works which aim at developing risk scores either rely on compartmental model-based reproduction numbers (which assume uniform population mixing) or develop coarse-grain spatial scores based on reproduction number (R0) and macro-level density-based mobility statistics. Instead, in this article, we develop a Hawkes process-based technique to assign relatively fine-grain spatial and temporal risk scores by leveraging high-resolution mobility data based on cell-phone originated location signals. While COVID-19 risk scores also depend on a number of factors specific to an individual, including demography and existing medical conditions, the primary mode of disease transmission is via physical proximity and contact. Therefore, we focus on developing risk scores based on location density and mobility behaviour. We demonstrate the efficacy of the developed risk scores via simulation based on real-world mobility data. Our results show that fine-grain spatiotemporal risk scores based on high-resolution mobility data can provide useful insights and facilitate safe re-opening. 
  2. Abstract

    Objective

    Emerging technologies (eg, wearable devices) have made it possible to collect data directly from individuals (eg, time-series), providing new insights on the health and well-being of individual patients. Broadening the access to these data would facilitate the integration with existing data sources (eg, clinical and genomic data) and advance medical research. Compared to traditional health data, these data are collected directly from individuals, are highly unique and provide fine-grained information, posing new privacy challenges. In this work, we study the applicability of a novel privacy model to enable individual-level time-series data sharing while maintaining the usability for data analytics.

    Methods and materials

    We propose a privacy-protecting method for sharing individual-level electrocardiography (ECG) time-series data, which leverages a dimensionality reduction technique and random sampling to achieve provable privacy protection. We show that our solution provides strong privacy protection against an informed adversarial model while enabling useful aggregate-level analysis.

    Results

    We conduct our evaluations on 2 real-world ECG datasets. Our empirical results show that the privacy risk is significantly reduced after sanitization while the data usability is retained for a variety of clinical tasks (eg, predictive modeling and clustering).

    Discussion

    Our study investigates the privacy risk in sharing individual-level ECG time-series data. We demonstrate that individual-level data can be highly unique, requiring new privacy solutions to protect data contributors.

    Conclusion

    The results suggest our proposed privacy-protection method provides strong privacy protections while preserving the usefulness of the data.

     
    more » « less
  3. Abstract

    Water monitoring in households provides occupants and utilities with key information to support water conservation and efficiency in the residential sector. High costs, intrusiveness, and practical complexity limit appliance-level monitoring via sub-meters on every water-consuming end use in households. Non-intrusive machine learning methods have emerged as promising techniques to analyze observed data collected by a single meter at the inlet of the house and estimate the disaggregated contribution of each water end use. While fine temporal resolution data allow for more accurate end-use disaggregation, there is an inevitable increase in the amount of data that needs to be stored and analyzed. To explore this tradeoff and advance previous studies based on synthetic data, we first collected 1 s resolution indoor water use data from a residential single-point smart water metering system installed at a four-person household, as well as ground-truth end-use labels based on a water diary recorded over a 4-week study period. Second, we trained a supervised machine learning model (random forest classifier) to classify six water end-use categories across different temporal resolutions and two different model calibration scenarios. Finally, we evaluated the results based on three different performance metrics (micro, weighted, and macro F1 scores). Our findings show that data collected at 1- to 5-s intervals allow for better end-use classification (weighted F-score higher than 0.85), particularly for toilet events; however, certain water end uses (e.g., shower and washing machine events) can still be predicted with acceptable accuracy even at coarser resolutions, up to 1 min, provided that these end-use categories are well represented in the training dataset. Overall, our study provides insights for further water sustainability research and widespread deployment of smart water meters.

     
    more » « less
  4. Activity recognition has applications in a variety of human-in-the-loop settings such as smart home health monitoring, green building energy and occupancy management, intelligent transportation, and participatory sensing. While fine-grained activity recognition systems and approaches help enable a multitude of novel applications, discovering such activities with non-intrusive ambient sensor systems poses challenging design, data processing, mining, and activity recognition issues. In this paper, we develop a low-cost heterogeneous Radar based Activity Monitoring (RAM) system for recognizing fine-grained activities. We exploit the feasibility of using an array of heterogeneous micro-doppler radars to recognize low-level activities. We prototype a short-range and a long-range radar system and evaluate the feasibility of using the system for fine-grained activity recognition. In our evaluation, using real data traces, we show that our system can detect fine-grained user activities with 92.84% accuracy.
  5. The increased ubiquity of small smart devices, such as cellphones, tablets, smart watches and laptops, has led to unique user data, which can be locally processed. The sensors (e.g., microphones and webcam) and improved hardware of the new devices have allowed running deep learning models that 20 years ago would have been exclusive to high-end expensive machines. In spite of this progress, state-of-the-art algorithms for facial expression recognition (FER) rely on architectures that cannot be implemented on these devices due to computational and memory constraints. Alternatives involving cloud-based solutions impose privacy barriers that prevent their adoption or user acceptance in a wide range of applications. This paper proposes a lightweight model that can run in real-time for image facial expression recognition (IFER) and video facial expression recognition (VFER). The approach relies on a personalization mechanism locally implemented for each subject by fine-tuning a central VFER model with unlabeled videos from a target subject. We train the IFER model to generate pseudo labels and we select the videos with the most confident predictions to be used for adaptation. The adaptation is performed by implementing a federated learning strategy where the weights of the local models are averaged and used by the central VFER model. We demonstrate that this approach can improve not only the performance on the edge device providing personalized models to the users, but also the central VFER model. Within-corpus and cross-corpus evaluations on two emotional databases demonstrate that edge models adapted with our personalization strategy achieve up to 13.1% gains in F1-scores. Furthermore, the federated learning implementation improves the mean micro F1-score of the central VFER model by up to 3.4%. The proposed lightweight solution is ideal for interactive user interfaces that preserve the data of the users.
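The weight-averaging step mentioned in the last abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract says the local weights are "averaged" without giving details, so the unweighted mean, the function name `average_weights`, and the dictionary-of-lists representation of model parameters are all hypothetical.

```python
def average_weights(local_models):
    """Element-wise unweighted mean of parameters from several local models.

    Each model is a dict mapping a parameter name to a flat list of floats.
    Assumption: every local model has the same parameter names and shapes.
    """
    if not local_models:
        raise ValueError("need at least one local model")
    n = len(local_models)
    return {
        name: [sum(vals) / n for vals in zip(*(m[name] for m in local_models))]
        for name in local_models[0]
    }


# Hypothetical parameters from two edge devices after local adaptation.
local_a = {"w": [1.0, 2.0], "b": [0.0]}
local_b = {"w": [3.0, 4.0], "b": [1.0]}

# The central model would take the averaged parameters.
central = average_weights([local_a, local_b])
print(central)
```

A common variant weights each local model by the number of samples it was adapted on; the abstract does not say which form was used.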