skip to main content


Title: Application of Convolutional Neural Network Algorithms for Advancing Sedentary and Activity Bout Classification
Background : Machine learning has been used for classification of physical behavior bouts from hip-worn accelerometers; however, this research has been limited due to the challenges of directly observing and coding human behavior “in the wild.” Deep learning algorithms, such as convolutional neural networks (CNNs), may offer better representation of data than other machine learning algorithms without the need for engineered features and may be better suited to dealing with free-living data. The purpose of this study was to develop a modeling pipeline for evaluation of a CNN model on a free-living data set and compare CNN inputs and results with the commonly used machine learning random forest and logistic regression algorithms. Method : Twenty-eight free-living women wore an ActiGraph GT3X+ accelerometer on their right hip for 7 days. A concurrently worn thigh-mounted activPAL device captured ground truth activity labels. The authors evaluated logistic regression, random forest, and CNN models for classifying sitting, standing, and stepping bouts. The authors also assessed the benefit of performing feature engineering for this task. Results : The CNN classifier performed best (average balanced accuracy for bout classification of sitting, standing, and stepping was 84%) compared with the other methods (56% for logistic regression and 76% for random forest), even without performing any feature engineering. Conclusion : Using the recent advancements in deep neural networks, the authors showed that a CNN model can outperform other methods even without feature engineering. This has important implications for both the model’s ability to deal with the complexity of free-living data and its potential transferability to new populations.  more » « less
Award ID(s):
1942724 1826967 1730158
NSF-PAR ID:
10227390
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Journal for the Measurement of Physical Behaviour
ISSN:
2575-6605
Page Range / eLocation ID:
1 to 9
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Identifying dust aerosols from passive satellite images is of great interest for many applications. In this study, we developed five different machine-learning (ML) based algorithms, including Logistic Regression, K Nearest Neighbor, Random Forest (RF), Feed Forward Neural Network (FFNN), and Convolutional Neural Network (CNN), to identify dust aerosols in the daytime satellite images from the Visible Infrared Imaging Radiometer Suite (VIIRS) under cloud-free conditions on a global scale. In order to train the ML algorithms, we collocated the state-of-the-art dust detection product from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) with the VIIRS observations along the CALIOP track. The 16 VIIRS M-band observations with the center wavelength ranging from deep blue to thermal infrared, together with solar-viewing geometries and pixel time and locations, are used as the predictor variables. Four different sets of training input data are constructed based on different combinations of VIIRS pixel and predictor variables. The validation and comparison results based on the collocated CALIOP data indicate that the FFNN method based on all available predictor variables is the best performing one among all methods. It has an averaged dust detection accuracy of about 81%, 89%, and 85% over land, ocean and whole globe, respectively, compared with collocated CALIOP. When applied to off-track VIIRS pixels, the FFNN method retrieves geographical distributions of dust that are in good agreement with on-track results as well as CALIOP statistics. For further evaluation, we compared our results based on the ML algorithms to NOAA’s Aerosol Detection Product (ADP), which is a product that classifies dust, smoke, and ash using physical-based methods. The comparison reveals both similarity and differences. Overall, this study demonstrates the great potential of ML methods for dust detection and proves that these methods can be trained on the CALIOP track and then applied to the whole granule of VIIRS granule. 
    more » « less
  2. Objective Sudden unexpected death in epilepsy (SUDEP) is the leading cause of epilepsy-related mortality. Although lots of effort has been made in identifying clinical risk factors for SUDEP in the literature, there are few validated methods to predict individual SUDEP risk. Prolonged postictal EEG suppression (PGES) is a potential SUDEP biomarker, but its occurrence is infrequent and requires epilepsy monitoring unit admission. We use machine learning methods to examine SUDEP risk using interictal EEG and ECG recordings from SUDEP cases and matched living epilepsy controls. Methods This multicenter, retrospective, cohort study examined interictal EEG and ECG recordings from 30 SUDEP cases and 58 age-matched living epilepsy patient controls. We trained machine learning models with interictal EEG and ECG features to predict the retrospective SUDEP risk for each patient. We assessed cross-validated classification accuracy and the area under the receiver operating characteristic (AUC) curve. Results The logistic regression (LR) classifier produced the overall best performance, outperforming the support vector machine (SVM), random forest (RF), and convolutional neural network (CNN). Among the 30 patients with SUDEP [14 females; mean age (SD), 31 (8.47) years] and 58 living epilepsy controls [26 females (43%); mean age (SD) 31 (8.5) years], the LR model achieved the median AUC of 0.77 [interquartile range (IQR), 0.73–0.80] in five-fold cross-validation using interictal alpha and low gamma power ratio of the EEG and heart rate variability (HRV) features extracted from the ECG. The LR model achieved the mean AUC of 0.79 in leave-one-center-out prediction. Conclusions Our results support that machine learning-driven models may quantify SUDEP risk for epilepsy patients, future refinements in our model may help predict individualized SUDEP risk and help clinicians correlate predictive scores with the clinical data. Low-cost and noninvasive interictal biomarkers of SUDEP risk may help clinicians to identify high-risk patients and initiate preventive strategies. 
    more » « less
  3. Malicious attacks, malware, and ransomware families pose critical security issues to cybersecurity, and it may cause catastrophic damages to computer systems, data centers, web, and mobile applications across various industries and businesses. Traditional anti-ransomware systems struggle to fight against newly created sophisticated attacks. Therefore, state-of-the-art techniques like traditional and neural network-based architectures can be immensely utilized in the development of innovative ransomware solutions. In this paper, we present a feature selection-based framework with adopting different machine learning algorithms including neural network-based architectures to classify the security level for ransomware detection and prevention. We applied multiple machine learning algorithms: Decision Tree (DT), Random Forest (RF), Naïve Bayes (NB), Logistic Regression (LR) as well as Neural Network (NN)-based classifiers on a selected number of features for ransomware classification. We performed all the experiments on one ransomware dataset to evaluate our proposed framework. The experimental results demonstrate that RF classifiers outperform other methods in terms of accuracy, F -beta, and precision scores. 
    more » « less
  4. Abstract Objective

    Body mass index (BMI) is the primary criterion differentiating anorexia nervosa (AN) and atypical anorexia nervosa despite prior literature indicating few differences between disorders. Machine learning (ML) classification provides us an efficient means of accurately distinguishing between twomeaningfulclasses given any number of features. The aim of the present study was to determine if ML algorithms can accurately distinguish AN and atypical AN given an ensemble of features excluding BMI, and if not, if the inclusion of BMI enables ML to accurately classify between the two.

    Methods

    Using an aggregate sample from seven studies consisting of individuals with AN and atypical AN who completed baseline questionnaires (N = 448), we used logistic regression, decision tree, and random forest ML classification models each trained on two datasets, one containing demographic, eating disorder, and comorbid features without BMI, and one retaining all features and BMI.

    Results

    Model performance for all algorithms trained with BMI as a feature was deemed acceptable (mean accuracy = 74.98%, mean area under the receiving operating characteristics curve [AUC] = 74.75%), whereas model performance diminished without BMI (mean accuracy = 59.37%, mean AUC = 59.98%).

    Discussion

    Model performance was acceptable, but not strong, if BMI was included as a feature; no other features meaningfully improved classification. When BMI was excluded, ML algorithms performed poorly at classifying cases of AN and atypical AN when considering other demographic and clinical characteristics. Results suggest a reconceptualization of atypical AN should be considered.

    Public Significance

    There is a growing debate about the differences between anorexia nervosa and atypical anorexia nervosa as their diagnostic differentiation relies on BMI despite being similar otherwise. We aimed to see if machine learning could distinguish between the two disorders and found accurate classification only if BMI was used as a feature. This finding calls into question the need to differentiate between the two disorders.

     
    more » « less
  5. Background : Hip-worn accelerometers are commonly used, but data processed using the 100 counts per minute cut point do not accurately measure sitting patterns. We developed and validated a model to accurately classify sitting and sitting patterns using hip-worn accelerometer data from a wide age range of older adults. Methods : Deep learning models were trained with 30-Hz triaxial hip-worn accelerometer data as inputs and activPAL sitting/nonsitting events as ground truth. Data from 981 adults aged 35–99 years from cohorts in two continents were used to train the model, which we call CHAP-Adult (Convolutional Neural Network Hip Accelerometer Posture-Adult). Validation was conducted among 419 randomly selected adults not included in model training. Results : Mean errors (activPAL − CHAP-Adult) and 95% limits of agreement were: sedentary time −10.5 (−63.0, 42.0) min/day, breaks in sedentary time 1.9 (−9.2, 12.9) breaks/day, mean bout duration −0.6 (−4.0, 2.7) min, usual bout duration −1.4 (−8.3, 5.4) min, alpha .00 (−.04, .04), and time in ≥30-min bouts −15.1 (−84.3, 54.1) min/day. Respective mean (and absolute) percent errors were: −2.0% (4.0%), −4.7% (12.2%), 4.1% (11.6%), −4.4% (9.6%), 0.0% (1.4%), and 5.4% (9.6%). Pearson’s correlations were: .96, .92, .86, .92, .78, and .96. Error was generally consistent across age, gender, and body mass index groups with the largest deviations observed for those with body mass index ≥30 kg/m 2 . Conclusions : Overall, these strong validation results indicate CHAP-Adult represents a significant advancement in the ambulatory measurement of sitting and sitting patterns using hip-worn accelerometers. Pending external validation, it could be widely applied to data from around the world to extend understanding of the epidemiology and health consequences of sitting. 
    more » « less