Title: Leveraging Collaborative-Filtering for Personalized Behavior Modeling: A Case Study of Depression Detection among College Students
The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors among all users. This disregards the fact that individuals can behave very differently, which reduces model performance. Further, the black-box models that are often used do not allow for interpretability or an understanding of human behavior. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majority voting to obtain the final prediction. We apply our algorithm to a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics.
Beyond achieving better classification performance, our novel approach is further able to generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future.
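As a rough illustration of the memory-based idea described in the abstract, the following Python sketch shows one generic way to impute a user's missing features from similar users and to combine per-epoch predictions by majority vote. It is not the authors' implementation: the cosine similarity measure, the clipping of negative weights to zero, and the function names `cosine_similarity`, `impute_missing`, and `majority_vote` are all assumptions made for this example, and the abstract does not specify how the relevance weights are actually computed.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity over the features observed (not None) in both vectors."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def impute_missing(target, others):
    """Fill each missing feature of `target` with a similarity-weighted mean
    of the values that other users have for that feature.

    Assumption for this sketch: negative similarities are clipped to zero,
    so dissimilar users simply do not contribute.
    """
    weights = [max(cosine_similarity(target, o), 0.0) for o in others]
    filled = list(target)
    for j, value in enumerate(target):
        if value is not None:
            continue
        num = 0.0
        den = 0.0
        for w, o in zip(weights, others):
            if o[j] is not None:
                num += w * o[j]
                den += w
        # Leave the gap in place if no similar user observed this feature.
        filled[j] = num / den if den > 0 else None
    return filled


def majority_vote(epoch_predictions):
    """Return the most common label among the per-epoch predictions."""
    return Counter(epoch_predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical feature vectors; None marks a missing sensor feature.
    target_user = [1.0, None, 2.0]
    other_users = [[1.0, 4.0, 2.0], [2.0, 8.0, -1.0]]
    print(impute_missing(target_user, other_users))

    # Hypothetical binary predictions from five time epochs.
    print(majority_vote([1, 0, 1, 1, 0]))
```

In this toy run, the first of the other users is almost perfectly aligned with the target on the shared features while the second is orthogonal to it, so the imputed value is driven by the first user alone. The vote then reduces one prediction per epoch (for example, different times of day or days of the week) to a single label.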
Award ID(s):
2009977
NSF-PAR ID:
10287395
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM Transactions on Human-Robot Interaction
ISSN:
2573-9522
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This work presents SeizFt—a novel seizure detection framework that utilizes machine learning to automatically detect seizures using wearable SensorDot EEG data. Inspired by interpretable sleep staging, our novel approach employs a unique combination of data augmentation, meaningful feature extraction, and an ensemble of decision trees to improve resilience to variations in EEG and to increase the capacity to generalize to unseen data. Fourier Transform (FT) Surrogates were utilized to increase sample size and improve the class balance between labeled non-seizure and seizure epochs. To enhance model stability and accuracy, SeizFt utilizes an ensemble of decision trees through the CatBoost classifier to classify each second of EEG recording as seizure or non-seizure. The SeizIt1 dataset was used for training, and the SeizIt2 dataset for validation and testing. Model performance for seizure detection was evaluated using two primary metrics: sensitivity using the any-overlap method (OVLP) and False Alarm (FA) rate using epoch-based scoring (EPOCH). Notably, SeizFt placed first among an array of state-of-the-art seizure detection algorithms as part of the Seizure Detection Grand Challenge at the 2023 International Conference on Acoustics, Speech, and Signal Processing (ICASSP). SeizFt outperformed state-of-the-art black-box models in accurate seizure detection and minimized false alarms, obtaining a total score of 40.15, combining OVLP and EPOCH across two tasks and representing an improvement of ~30% from the next best approach. The interpretability of SeizFt is a key advantage, as it fosters trust and accountability among healthcare professionals. The most predictive seizure detection features extracted from SeizFt were: delta wave, interquartile range, standard deviation, total absolute power, theta wave, the ratio of delta to theta, binned entropy, Hjorth complexity, delta + theta, and Higuchi fractal dimension. 
In conclusion, the successful application of SeizFt to wearable SensorDot data suggests its potential for real-time, continuous monitoring to improve personalized medicine for epilepsy.
  2. Abstract

    Although combination antiretroviral therapy (ART) with three or more drugs is highly effective in suppressing viral load for people with HIV (human immunodeficiency virus), many ART agents may exacerbate mental health‐related adverse effects including depression. Therefore, understanding the effects of combination ART on mental health can help clinicians personalize medicine with fewer adverse effects to avoid undesirable health outcomes. The emergence of electronic health records offers researchers unprecedented access to HIV data including individuals' mental health records, drug prescriptions, and clinical information over time. However, modeling such data is challenging due to the high dimensionality of the drug combination space, individual heterogeneity, and the sparseness of the observed drug combinations. To address these challenges, we develop a Bayesian nonparametric approach to learn the effect of drug combinations on mental health in people with HIV, adjusting for sociodemographic, behavioral, and clinical factors. The proposed method is built upon the subset‐tree kernel that represents drug combinations in a way that synthesizes known regimen structure into a single mathematical representation. It also utilizes a distance‐dependent Chinese restaurant process to cluster heterogeneous populations while considering individuals' treatment histories. We evaluate the proposed approach through simulation studies, and apply the method to a dataset from the Women's Interagency HIV Study, showing the clinical utility of our model in guiding clinicians to prescribe informed and effective personalized treatment based on individuals' treatment histories and clinical characteristics.
  3. We built and compared several machine learning models to predict future self-reported wellbeing labels (of mood, health, and stress) for the next day and for up to 7 days in the future, using multi-modal data. The data are from surveys, wearables, mobile phones, and weather information collected in a study of college students, each providing daily data for 30 or 90 days. We compared the performance of multiple models, including personalized multi-task models and deep learning models. The best personalized multi-task linear model showed mean absolute errors of 12.8, 11.9, and 13.7 on a continuous 100-point scale for estimating the next day's mood, health, and stress values, while the best multi-task neural network model, applied to 3-way high/med/low classification of the wellbeing values, showed F1 scores of 0.71, 0.74, and 0.66 on mood, health, and stress metrics, respectively. We found that features related to weather and morning academic activities are strongly associated with wellbeing labels. We further found greater prediction accuracy among participants with the least fluctuations in their wellbeing labels.
  4. Abstract

    Radiogenomics uses machine learning (ML) to directly connect the morphologic and physiological appearance of tumors on clinical imaging with underlying genomic features. Despite extensive growth in the area of radiogenomics across many cancers, and its potential role in advancing clinical decision making, no published studies have directly addressed uncertainty in these model predictions. We developed a radiogenomics ML model to quantify uncertainty using transductive Gaussian Processes (GP) and a unique dataset of 95 image-localized biopsies with spatially matched MRI from 25 untreated Glioblastoma (GBM) patients. The model generated predictions for regional EGFR amplification status (a common and important target in GBM) to resolve the intratumoral genetic heterogeneity across each individual tumor, a key factor for future personalized therapeutic paradigms. The model used probability distributions for each sample prediction to quantify uncertainty, and used transductive learning to reduce the overall uncertainty. We compared predictive accuracy and uncertainty of the transductive learning GP model against a standard GP model using leave-one-patient-out cross validation. Additionally, we used a separate dataset containing 24 image-localized biopsies from 7 high-grade glioma patients to validate the model. Predictive uncertainty informed the likelihood of achieving an accurate sample prediction. When stratifying predictions based on uncertainty, we observed substantially higher performance in the group cohort (75% accuracy, n = 95) and amongst sample predictions with the lowest uncertainty (83% accuracy, n = 72) compared to predictions with higher uncertainty (48% accuracy, n = 23), due largely to data interpolation (rather than extrapolation). On the separate validation set, our model achieved 78% accuracy amongst the sample predictions with lowest uncertainty.
We present a novel approach to quantify radiogenomics uncertainty to enhance model performance and clinical interpretability. This should help integrate more reliable radiogenomics models for improved medical decision-making. 
  5. There is a growing body of research revealing that longitudinal passive sensing data from smartphones and wearable devices can capture daily behavior signals for human behavior modeling, such as depression detection. Most prior studies build and evaluate machine learning models using data collected from a single population. However, to ensure that a behavior model can work for a larger group of users, its generalizability needs to be verified on multiple datasets from different populations. We present the first work evaluating cross-dataset generalizability of longitudinal behavior models, using depression detection as an application. We collect multiple longitudinal passive mobile sensing datasets with over 500 users from two institutes over a two-year span, leading to four institute-year datasets. Using the datasets, we closely re-implement and evaluate nine prior depression detection algorithms. Our experiment reveals the lack of model generalizability of these methods. We also implement eight recently popular domain generalization algorithms from the machine learning community. Our results indicate that these methods also do not generalize well on our datasets, with barely any advantage over the naive baseline of guessing the majority. We then present two new algorithms with better generalizability. Our new algorithm, Reorder, significantly and consistently outperforms existing methods on most cross-dataset generalization setups. However, the overall advantage is incremental and still has great room for improvement. Our analysis reveals that individual differences (both within and between populations) may play the most important role in the cross-dataset generalization challenge. Finally, we provide an open-source benchmark platform, GLOBEM (short for Generalization of Longitudinal BEhavior Modeling), to consolidate all 19 algorithms. GLOBEM can support researchers in using, developing, and evaluating different longitudinal behavior modeling methods.
We call for researchers' attention to model generalizability evaluation for future longitudinal human behavior modeling studies. 