Homekit2020: A Benchmark for Time Series Classification on a Large Mobile Sensing Dataset with Laboratory Tested Ground Truth of Influenza Infections

Merrill, Mike A; Safranchik, Esteban; Kolbeinsson, Arinbjörn; Gade, Piyusha; Ramirez, Ernesto; Schmidt, Ludwig; Foschini, Luca; Althoff, Tim

Despite increased interest in wearables as tools for detecting various health conditions, there are not yet any large public benchmarks for such mobile sensing data. The few datasets that are available do not contain data from more than dozens of individuals, do not contain high-resolution raw data, or do not include dataloaders for easy integration into machine learning pipelines. Here, we present Homekit2020: the first large-scale public benchmark for time series classification of wearable sensor data. Our dataset contains over 14 million hours of minute-level multimodal Fitbit data, symptom reports, and ground-truth laboratory PCR influenza test results, along with an evaluation framework that mimics realistic model deployments and efficiently characterizes statistical uncertainty in model selection in the presence of extreme class imbalance. Furthermore, we implement and evaluate nine neural and non-neural time series classification models on our benchmark across 450 total training runs in order to establish state-of-the-art performance.

Award ID(s): 1901386; 2142794

PAR ID: 10435881

Author(s) / Creator(s): Merrill, Mike A; Safranchik, Esteban; Kolbeinsson, Arinbjörn; Gade, Piyusha; Ramirez, Ernesto; Schmidt, Ludwig; Foschini, Luca; Althoff, Tim

Date Published: 2023-01-01

Journal Name: Conference on Health, Inference, and Learning (CHIL)

Sponsoring Org: National Science Foundation

