Title: Leveraging Collaborative-Filtering for Personalized Behavior Modeling: A Case Study of Depression Detection among College Students
The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors among all users. This disregards the fact that individuals can behave very differently and results in reduced model performance. Further, the black-box models that are often used do not allow for interpretability or an understanding of human behavior. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majority voting to obtain the final prediction. We apply our algorithm to a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics.
Beyond achieving better classification performance, our novel approach is further able to generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future.
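The abstract's pipeline (relevance weights between individuals, imputation of missing values, and per-epoch predictions combined by majority voting) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Cosine similarity stands in for the paper's relevance measure, which the abstract does not specify, and the function names `cosine_similarity`, `impute_missing`, and `majority_vote` are hypothetical. Missing values are assumed to be encoded as `None`.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity over the features that both users have observed.

    Assumption: missing values are encoded as None.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def impute_missing(target, others):
    """Fill each missing feature of `target` with a similarity-weighted mean.

    Weights are the (non-negative) similarities between `target` and each
    other user; a feature stays None if no similar user has observed it.
    """
    weights = [max(cosine_similarity(target, o), 0.0) for o in others]
    filled = list(target)
    for j, value in enumerate(target):
        if value is not None:
            continue
        num = sum(w * o[j] for w, o in zip(weights, others) if o[j] is not None)
        den = sum(w for w, o in zip(weights, others) if o[j] is not None)
        filled[j] = num / den if den > 0 else None
    return filled


def majority_vote(epoch_predictions):
    """Return the most common label across per-epoch predictions."""
    return Counter(epoch_predictions).most_common(1)[0][0]
```

In this sketch, a user whose observed features closely match the target's contributes more to the imputed value than a dissimilar user, which is the memory-based, collaborative-filtering idea the abstract refers to. Ties in `majority_vote` resolve to the label seen first, which is an arbitrary choice made here for simplicity.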
Award ID(s):
2009977
PAR ID:
10287395
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM Transactions on Human-Robot Interaction
ISSN:
2573-9522
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We built and compared several machine learning models to predict future self-reported wellbeing labels (of mood, health, and stress) for the next day and for up to 7 days in the future, using multi-modal data. The data are from surveys, wearables, mobile phones, and weather information collected in a study of college students, each providing daily data for 30 or 90 days. We compared the performance of multiple models, including personalized multi-task models and deep learning models. The best personalized multi-task linear model showed mean absolute errors of 12.8, 11.9, and 13.7 on a continuous 100-point scale for estimating the next day's mood, health, and stress values, while the best multi-task neural network model, applied to 3-way high/med/low classification of the wellbeing values, showed F1 scores of 0.71, 0.74, and 0.66 on mood, health, and stress metrics, respectively. We found that features related to weather and morning academic activities are strongly associated with wellbeing labels. We further found greater prediction accuracy among participants with the least fluctuations in their wellbeing labels.
  2. Longitudinal human behavior modeling has received increasing attention over the years due to its widespread applications to patient monitoring, dietary and lifestyle recommendations, and just-in-time intervention for at-risk individuals (e.g., problematic drug users and struggling students), to name a few. Using in-the-moment health data collected via ubiquitous devices (e.g., smartphones and smartwatches), this multidisciplinary field focuses on developing predictive models for certain health or well-being outcomes (e.g., depression and stress) in the near future given the time series of individual behaviors (e.g., resting heart rate, sleep quality, and current feelings). Yet, most existing models on these data, which we refer to as ubiquitous health data, do not achieve adequate accuracy. The latest works that yielded promising results have yet to consider realistic aspects of ubiquitous health data (e.g., features of different types and a high rate of missing values) and the consumption of various resources (e.g., computing power, time, and cost). Given these two shortcomings, it is doubtful whether these studies could translate to realistic settings. In this paper, we propose MuHBoost, a multi-label boosting method for addressing these shortcomings, by leveraging advanced methods in large language model (LLM) prompting and multi-label classification (MLC) to jointly predict multiple health or well-being outcomes. Because LLMs can hallucinate when tasked with answering multiple questions simultaneously, we also develop two variants of MuHBoost that alleviate this issue and thereby enhance its predictive performance. We conduct extensive experiments to evaluate MuHBoost and its variants on 13 health and well-being prediction tasks defined from four realistic ubiquitous health datasets. Our results show that our three developed methods outperform all considered baselines across three standard MLC metrics, demonstrating their effectiveness while ensuring resource efficiency.
  3. Individualized modeling has become increasingly popular in recent years with its growing application in fields such as personalized medicine and mobile health studies. With rich longitudinal measurements, it is of great interest to model certain subject‐specific time‐varying covariate effects. In this paper, we propose an individualized time‐varying nonparametric model by leveraging the subgroup information from the population. The proposed method approximates the time‐varying covariate effect using nonparametric B‐splines and aggregates the estimated nonparametric coefficients that share common patterns. Moreover, the proposed method can effectively handle various missing data patterns that frequently arise in mobile health data. Specifically, our method achieves subgrouping by flexibly accommodating varying dimensions of B‐spline coefficients due to missingness. This capability sets it apart from other fusion‐type approaches for subgrouping. The subgroup information can also potentially provide meaningful insight into the characteristics of subjects and assist in recommending an effective treatment or intervention. An efficient ADMM algorithm is developed for implementation. Our numerical studies and application to mobile health data on monitoring pregnant women's deep sleep and physical activities demonstrate that the proposed method achieves better performance compared to other existing methods.
  4. Although combination antiretroviral therapy (ART) with three or more drugs is highly effective in suppressing viral load for people with HIV (human immunodeficiency virus), many ART agents may exacerbate mental health‐related adverse effects including depression. Therefore, understanding the effects of combination ART on mental health can help clinicians personalize medicine with fewer adverse effects to avoid undesirable health outcomes. The emergence of electronic health records offers researchers unprecedented access to HIV data including individuals' mental health records, drug prescriptions, and clinical information over time. However, modeling such data is challenging due to the high dimensionality of the drug combination space, individual heterogeneity, and the sparseness of the observed drug combinations. To address these challenges, we develop a Bayesian nonparametric approach to learn the effect of drug combinations on mental health in people with HIV, adjusting for sociodemographic, behavioral, and clinical factors. The proposed method is built upon the subset‐tree kernel that represents drug combinations in a way that synthesizes known regimen structure into a single mathematical representation. It also utilizes a distance‐dependent Chinese restaurant process to cluster heterogeneous populations while considering individuals' treatment histories. We evaluate the proposed approach through simulation studies, and apply the method to a dataset from the Women's Interagency HIV Study, showing the clinical utility of our model in guiding clinicians to prescribe informed and effective personalized treatment based on individuals' treatment histories and clinical characteristics.
  5. Traditional adversarial attacks typically aim to alter the predicted labels of input images by generating perturbations that are imperceptible to the human eye. However, these approaches often lack explainability. Moreover, most existing work on adversarial attacks focuses on single-stage classifiers, while multi-stage classifiers remain largely unexplored. In this paper, we introduce instance-based adversarial attacks for multi-stage classifiers, leveraging Layer-wise Relevance Propagation (LRP), which assigns relevance scores to pixels based on their influence on classification outcomes. Our approach generates explainable adversarial perturbations by utilizing LRP to identify and target key features critical for both coarse and fine-grained classifications. Unlike conventional attacks, our method not only induces misclassification but also enhances the interpretability of the model's behavior across classification stages, as demonstrated by experimental results.