Cardiorespiratory fitness is a predictor of long-term health, traditionally assessed through structured exercise protocols that require maximal effort and controlled laboratory conditions. While clinically validated, these protocols are often inaccessible, physically demanding, and unsuitable for unsupervised monitoring. This study proposes a non-invasive, unsupervised alternative: predicting the heart rate a person would reach after completing a standardized step test, using wearable data collected during natural daily activities. Ground-truth post-exercise heart rate was obtained through the Queens College Step Test, a submaximal protocol widely used in fitness settings. Separately, wearable sensors recorded heart rate (HR), blood oxygen saturation, and motion data during a protocol of lifestyle tasks spanning a range of intensities. Two machine learning models were developed: a Human Activity Recognition (HAR) model that classified daily activities from inertial data with 96.93% accuracy, and a regression model that estimated post-step-test HR from motion features, physiological trends, and demographic context. The regression model achieved an average root mean squared error (RMSE) of 5.13 beats per minute (bpm) and a mean absolute error (MAE) of 4.37 bpm. These findings demonstrate the potential of test-free methods to estimate standardized test outcomes from daily activity data, offering an accessible pathway for inferring cardiorespiratory fitness.
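To make the reported error metrics concrete, the sketch below trains a feature-based regressor for post-step-test HR and scores it with RMSE and MAE as in the abstract; the feature set, model choice, and all data are illustrative assumptions on synthetic values, not the study's actual pipeline.

```python
# Hedged sketch: regress post-step-test HR from daily-activity features on
# synthetic data, then report RMSE / MAE. Everything here is illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
# Illustrative features: resting-HR trend, mean SpO2, IMU motion intensity, age.
X = np.column_stack([
    rng.normal(75, 10, n),
    rng.normal(97, 1.5, n),
    rng.normal(0.8, 0.3, n),
    rng.integers(20, 60, n).astype(float),
])
# Synthetic post-step-test HR target (bpm); the real target comes from the step test.
y = 110 + 0.4 * X[:, 0] - 10 * X[:, 2] + rng.normal(0, 5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)

rmse = mean_squared_error(y_te, pred) ** 0.5   # RMSE in bpm
mae = mean_absolute_error(y_te, pred)          # MAE in bpm
print(f"RMSE = {rmse:.2f} bpm, MAE = {mae:.2f} bpm")
```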
Daily Activities Wearable Dataset for Cardiorespiratory Fitness Estimation
{"Abstract":["This dataset was collected as part of the study "Indirect AI-Based Estimation of Cardiorespiratory Fitness from Daily Activities Using Wearables." It contains synchronized sensor data and physiological measurements from participants performing a structured sequence of daily activities. The dataset is designed to support research in human activity recognition (HAR) and indirect estimation of cardiorespiratory fitness, particularly through heart rate regression after a submaximal step test.\n\nParticipants wore a combination of inertial measurement units (IMUs) and biometric sensors in a controlled indoor environment. Each session followed a predefined activity protocol interleaving rest and effort, and a 3-minute step test to elicit a measurable cardiorespiratory response.\n\nThe dataset includes:\n\n\n\n\n\nRaw and preprocessed IMU data from the chest, hands, and knees (quaternions, accelerometers, gyroscopes).\n\n\n\n\nFrame-level activity labels aligned with the protocol (target level for HAR).\n\n\n\n\nBiomarker data: heart rate and SpO₂ sampled at 0.5 Hz.\n\n\n\n\nDemographic metadata (age, height, weight, gender, BMI, body fat %, etc.).\n\n\n\n\nStep test heart rate (target variable for regression)."]}
- Award ID(s): 2439345
- PAR ID: 10661946
- Publisher / Repository: Zenodo
- Date Published:
- Subject(s) / Keyword(s): HAR; CRF; Wearable Sensors; Health Monitoring
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Human activity recognition (HAR) and, more broadly, activities of daily life recognition using wearable devices have the potential to transform a number of applications, including mobile healthcare, smart homes, and fitness monitoring. Recent approaches for HAR use multiple sensors at various locations on the body to achieve higher accuracy for complex activities. While multiple sensors increase accuracy, they are also susceptible to reliability issues when one or more sensors are unable to provide data to the application due to sensor malfunction, user error, or energy limitations. Training multiple activity classifiers that use a subset of sensors is not desirable, since it may lead to reduced accuracy for applications. To handle these limitations, we propose a novel generative approach that recovers the missing data of sensors using data available from other sensors. The recovered data are then used to seamlessly classify activities. Experiments using three publicly available activity datasets show that with data missing from one sensor, the proposed approach achieves accuracy that is within 10% of the accuracy with no missing data. Moreover, implementation on a wearable device prototype shows that the proposed approach takes about 1.5 ms to recover data in the w-HAR dataset, which results in an energy consumption of 606 μJ. The low energy consumption ensures that SensorGAN is suitable for effectively recovering data in tinyML applications on energy-constrained devices.
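The following sketch conveys the recover-then-classify idea described above on synthetic features: a simple learned mapping stands in for the generative model, reconstructing a missing sensor's features from the available ones before classification. It is not the SensorGAN architecture, which uses an adversarial objective; all dimensions and data below are assumptions.

```python
# Hedged sketch: recover a missing sensor's features from the available sensors,
# then classify with the recovered data. A plain MLP regressor stands in for the
# generative recovery model; the features and labels are synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n, d_avail, d_missing = 500, 12, 6          # feature dims of available / missing sensors
X_avail = rng.normal(size=(n, d_avail))
X_missing = X_avail[:, :d_missing] * 0.7 + rng.normal(scale=0.3, size=(n, d_missing))
y = (X_avail[:, 0] + X_missing[:, 0] > 0).astype(int)   # synthetic activity labels

# Train the recovery model and the classifier while all sensors are present.
recover = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
recover.fit(X_avail[:400], X_missing[:400])
clf = RandomForestClassifier(random_state=0)
clf.fit(np.hstack([X_avail[:400], X_missing[:400]]), y[:400])

# At runtime the second sensor is unavailable: recover its data, then classify as usual.
X_rec = recover.predict(X_avail[400:])
acc = clf.score(np.hstack([X_avail[400:], X_rec]), y[400:])
print(f"accuracy with recovered sensor data: {acc:.2f}")
```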
-
Human activity recognition (HAR) is growing in popularity due to its wide-ranging applications in patient rehabilitation and movement disorders. HAR approaches typically start with collecting sensor data for the activities under consideration and then develop algorithms using the dataset. As such, the success of algorithms for HAR depends on the availability and quality of datasets. Most of the existing work on HAR uses data from inertial sensors on wearable devices or smartphones to design HAR algorithms. However, inertial sensors exhibit high noise that makes it difficult to segment the data and classify the activities. Furthermore, existing approaches typically do not make their data available publicly, which makes it difficult or impossible to compare HAR approaches. To address these issues, we present wearable HAR (w-HAR), which contains labeled data of seven activities from 22 users. Our dataset’s unique aspect is the integration of data from inertial and wearable stretch sensors, thus providing two modalities of activity information. The wearable stretch sensor data allow us to create variable-length segments and ensure that each segment contains a single activity. We also provide a HAR framework that uses w-HAR to classify the activities. To this end, we first perform a design space exploration to choose a neural network architecture for activity classification. Then, we use two online learning algorithms to adapt the classifier to users whose data are not included at design time. Experiments on the w-HAR dataset show that our framework achieves 95% accuracy, while the online learning algorithms improve the accuracy by as much as 40%.
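To illustrate the kind of variable-length segmentation a stretch sensor enables, the sketch below cuts segment boundaries wherever a smoothed synthetic stretch signal crosses a threshold; the signal, sampling rate, and threshold are assumptions, not the w-HAR segmentation procedure itself.

```python
# Hedged sketch: variable-length activity segments from a stretch-sensor signal.
# Boundaries are placed where the smoothed signal crosses a fixed threshold.
import numpy as np

rng = np.random.default_rng(2)
fs = 25                                            # assumed stretch-sensor rate (Hz)
sit = rng.normal(0.1, 0.02, 5 * fs)                # low stretch while sitting
walk = 0.5 + 0.2 * np.sin(np.linspace(0, 20 * np.pi, 10 * fs))  # periodic while walking
signal = np.concatenate([sit, walk, sit])

smooth = np.convolve(signal, np.ones(fs) / fs, mode="same")     # 1-second moving average
active = smooth > 0.3                                           # threshold on stretch level
boundaries = np.flatnonzero(np.diff(active.astype(int))) + 1    # activity transitions

segments = np.split(signal, boundaries)            # variable-length, one activity each
for i, seg in enumerate(segments):
    print(f"segment {i}: {len(seg) / fs:.1f} s")
```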
-
Automated tracking of physical fitness has sparked a health revolution by allowing individuals to track their own physical activity and health in real time. This concept is beginning to be applied to the tracking of cognitive load. It is well known that activity in the brain can be measured through changes in the body’s physiology, but current real-time measures tend to be unimodal and invasive. We therefore propose the concept of a wearable educational fitness (EduFit) tracker. We use machine learning with physiological data to understand how to develop a wearable device that tracks cognitive load accurately in real time. In an initial study, we found that body temperature, skin conductance, and heart rate were able to distinguish between (i) a problem-solving activity (high cognitive load), (ii) a leisure activity (moderate cognitive load), and (iii) daydreaming (low cognitive load) with high accuracy in the test dataset. In a second study, we found that these physiological features can be used to accurately predict user-reported mental focus in the test dataset, even when relatively small amounts of training data are used. We explain how these findings inform the development and implementation of a wearable device for temporal tracking and logging of a user’s learning activities and cognitive load.
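A minimal sketch of this classification setup is shown below: a generic classifier separating low, moderate, and high cognitive load from body temperature, skin conductance, and heart rate, trained on synthetic data. The model choice and feature distributions are assumptions, not the study's actual method.

```python
# Hedged sketch: cognitive-load classification from three physiological features
# (body temperature, skin conductance, heart rate) on synthetic samples.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
# Assumed per-level means for (temperature C, skin conductance uS, heart rate bpm).
levels = {"low": (36.5, 2.0, 65), "moderate": (36.7, 4.0, 75), "high": (36.9, 6.0, 85)}
X, y = [], []
for label, (temp, eda, hr) in levels.items():
    for _ in range(100):
        X.append([rng.normal(temp, 0.2), rng.normal(eda, 1.0), rng.normal(hr, 5)])
        y.append(label)
X, y = np.array(X), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print(f"cognitive-load classification accuracy: {clf.score(X_te, y_te):.2f}")
```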
-
In Activities of Daily Living (ADL) research, which has gained prominence due to the burgeoning aging population, the challenge of acquiring sufficient ground truth data for model training is a significant bottleneck. This obstacle necessitates a pivot towards unsupervised representation learning methodologies, which do not require large labeled datasets. Existing research has focused on the tradeoff between fully supervised models and unsupervised pre-trained models and found that the unsupervised version outperformed in most cases. However, that investigation did not use sufficiently large Human Activity Recognition (HAR) datasets; both datasets used were only three-dimensional. This poster extends the investigation by employing a large multivariate time series HAR dataset and experimenting with different combinations of critical training parameters, such as batch size and learning rate, to observe the performance tradeoff. Our findings reveal that the pre-trained model is comparable to fully supervised classification on a larger multivariate time series HAR dataset. This discovery underscores the potential of unsupervised representation learning for ADL extraction and highlights the importance of model configuration in optimizing performance.
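The sketch below illustrates the tradeoff on synthetic multivariate windows: a light classifier trained on an unsupervised representation (PCA standing in for a pre-trained encoder) versus the same classifier trained fully supervised on the raw features. All data, window sizes, and model choices are illustrative assumptions.

```python
# Hedged sketch: unsupervised representation + light classifier vs. fully
# supervised classification on flattened multivariate time-series windows.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n, win, ch = 600, 50, 6                      # windows of (timesteps x channels)
y = rng.integers(0, 3, n)                    # three synthetic activity classes
X = rng.normal(size=(n, win, ch)) + y[:, None, None] * 0.4
X = X.reshape(n, -1)                         # flatten each window to a feature vector

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# "Pre-trained" route: fit the representation without labels, then a light classifier.
pca = PCA(n_components=16).fit(X_tr)
clf_pre = LogisticRegression(max_iter=1000).fit(pca.transform(X_tr), y_tr)

# Fully supervised route: the same classifier on the raw flattened windows.
clf_sup = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

print(f"pre-trained representation: {clf_pre.score(pca.transform(X_te), y_te):.2f}")
print(f"fully supervised:           {clf_sup.score(X_te, y_te):.2f}")
```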
