Title: A call for open data to develop mental health digital biomarkers
Digital biomarkers of mental health, created using data extracted from everyday technologies including smartphones, wearable devices, social media and computer interactions, have the potential to revolutionise mental health diagnosis and treatment by providing near-continuous, unobtrusive and remote measures of behaviours associated with mental health symptoms. Machine learning models process data traces from these technologies to identify digital biomarkers. In this editorial, we caution clinicians against using digital biomarkers in practice until models are assessed for equitable predictions (‘model equity’) across demographically diverse patients at scale, behaviours over time, and data types extracted from different devices and platforms. We posit that it will be difficult for any individual clinic or large-scale study to assess and ensure model equity, and we instead call for the creation of a repository of open de-identified data for digital biomarker development.
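The abstract does not say how model equity would be measured. As one possible illustration only, the sketch below compares a model's accuracy across groups and reports the largest gap between any two groups. The function names, the choice of accuracy as the metric, and the toy data are all assumptions made for this example, not the editorial's method.

```python
from collections import defaultdict


def per_group_accuracy(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions split by a grouping attribute."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def equity_gap(y_true, y_pred, groups):
    """Largest accuracy difference between any two groups (0.0 means equal)."""
    accuracies = per_group_accuracy(y_true, y_pred, groups)
    return max(accuracies.values()) - min(accuracies.values())


# Hypothetical labels, predictions, and group memberships (made up for illustration).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(per_group_accuracy(y_true, y_pred, groups))
print(equity_gap(y_true, y_pred, groups))
```

The same comparison could be repeated with the grouping attribute set to a device type or a time window, which are the other two axes the abstract names.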
Award ID(s):
1750326
NSF-PAR ID:
10325326
Journal Name:
BJPsych Open
Volume:
8
Issue:
2
ISSN:
2056-4724
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background Even before the onset of the COVID-19 pandemic, children and adolescents were experiencing a mental health crisis, partly due to a lack of quality mental health services. The rate of suicide for Black youth has increased by 80%. By 2025, the health care system will be short of 225,000 therapists, further exacerbating the current crisis. Therefore, it is of utmost importance for providers, schools, youth mental health, and pediatric medical providers to integrate innovation in digital mental health to identify problems proactively and rapidly for effective collaboration with other health care providers. Such approaches can help identify robust, reproducible, and generalizable predictors and digital biomarkers of treatment response in psychiatry. Among the multitude of digital innovations currently used to identify biomarkers for psychiatric diseases, as part of the macrolevel digital health transformation, speech stands out as an attractive candidate because it is affordable, noninvasive, and nonintrusive. Objective The protocol aims to develop speech-emotion recognition algorithms leveraging artificial intelligence/machine learning, which can establish a link between trauma, stress, and voice types, including disrupting speech-based characteristics, and detect clinically relevant emotional distress and functional impairments in children and adolescents. Methods Informed by theoretical foundations (the Theory of Psychological Trauma Biomarkers and Archetypal Voice Categories), we developed our methodology to focus on 5 emotions: anger, happiness, fear, neutral, and sadness. Participants will be recruited from 2 local mental health centers that serve urban youths. Speech samples, along with responses to the Symptom and Functioning Severity Scale, Patient Health Questionnaire 9, and Adverse Childhood Experiences scales, will be collected using an Android mobile app. Our model development pipeline is informed by Gaussian mixture model (GMM), recurrent neural network, and long short-term memory. Results We tested our model with a public data set. The GMM with 128 clusters showed an evenly distributed accuracy across all 5 emotions. Using utterance-level features, GMM achieved an accuracy of 79.15% overall, while frame selection increased accuracy to 85.35%. This demonstrates that GMM is a robust model for emotion classification of all 5 emotions and that emotion frame selection enhances accuracy, which is significant for scientific evaluation. Recruitment and data collection for the study were initiated in August 2021 and are currently underway. The study results are likely to be available and published in 2024. Conclusions This study contributes to the literature as it addresses the need for speech-focused digital health tools to detect clinically relevant emotional distress and functional impairments in children and adolescents. The preliminary results show that our algorithm has the potential to improve outcomes. The findings will contribute to the broader digital health transformation. International Registered Report Identifier (IRRID) DERR1-10.2196/46970
  2. Abstract

    Opioid use disorder is one of the most pressing public health problems of our time. Mobile health tools, including wearable sensors, have great potential in this space, but have been underutilized. Of specific interest are digital biomarkers, or end-user generated physiologic or behavioral measurements that correlate with health or pathology. The current manuscript describes a longitudinal, observational study of adult patients receiving opioid analgesics for acute painful conditions. Participants in the study are monitored with a wrist-worn E4 sensor, during which time physiologic parameters (heart rate/variability, electrodermal activity, skin temperature, and accelerometry) are collected continuously. Opioid use events are recorded via electronic medical record and self-report. Three hundred thirty-nine discrete dose opioid events from 36 participants are analyzed among 2070 h of sensor data. Fifty-one features are extracted from the data and initially compared pre- and post-opioid administration, and subsequently are used to generate machine learning models. Model performance is compared based on individual and treatment characteristics. The best performing machine learning model to detect opioid administration is a Channel-Temporal Attention-Temporal Convolutional Network (CTA-TCN) model using raw data from the wearable sensor. History of intravenous drug use is associated with better model performance, while middle age and co-administration of non-narcotic analgesia or sedative drugs are associated with worse model performance. These characteristics may be candidate input features for future opioid detection model iterations. Once mature, this technology could provide clinicians with actionable data on opioid use patterns in real-world settings, and predictive analytics for early identification of opioid use disorder risk.

     
  3. Background

    Maternal loneliness is associated with adverse physical and mental health outcomes for both the mother and her child. Detecting maternal loneliness noninvasively through wearable devices and passive sensing provides opportunities to prevent or reduce the impact of loneliness on the health and well-being of the mother and her child.

    Objective

    The aim of this study is to use objective health data collected passively by a wearable device to predict maternal (social) loneliness during pregnancy and the postpartum period and identify the important objective physiological parameters in loneliness detection.

    Methods

    We conducted a longitudinal study using smartwatches to continuously collect physiological data from 31 women during pregnancy and the postpartum period. The participants completed the University of California, Los Angeles (UCLA) loneliness questionnaire in gestational week 36 and again at 12 weeks post partum. Responses to this questionnaire and background information of the participants were collected through our customized cross-platform mobile app. We leveraged participants’ smartwatch data from the 7 days before and the day of their completion of the UCLA questionnaire for loneliness prediction. We categorized the loneliness scores from the UCLA questionnaire as loneliness (scores≥12) and nonloneliness (scores<12). We developed decision tree and gradient-boosting models to predict loneliness. We evaluated the models by using leave-one-participant-out cross-validation. Moreover, we discussed the importance of extracted health parameters in our models for loneliness prediction.

    Results

    The gradient boosting and decision tree models predicted maternal social loneliness with weighted F1-scores of 0.897 and 0.872, respectively. Our results also show that loneliness is highly associated with activity intensity and activity distribution during the day. In addition, resting heart rate (HR) and resting HR variability (HRV) were correlated with loneliness.

    Conclusions

    Our results show the potential benefit and feasibility of using passive sensing with a smartwatch to predict maternal loneliness. Our developed machine learning models achieved a high F1-score for loneliness prediction. We also show that intensity of activity, activity pattern, and resting HR and HRV are good predictors of loneliness. These results indicate the intervention opportunities made available by wearable devices and predictive models to improve maternal well-being through early detection of loneliness.
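The Methods above describe two mechanical steps that are easy to get wrong: binarizing the UCLA score at 12 and holding out one participant at a time during evaluation. The following is a minimal sketch of both, assuming each sample carries a participant identifier. The helper names and the toy identifiers are hypothetical, and the model training itself (decision tree or gradient boosting) is deliberately left out.

```python
def label_loneliness(score, threshold=12):
    """Binarize a UCLA loneliness score: 1 if score >= threshold, else 0."""
    return 1 if score >= threshold else 0


def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices), one fold per participant.

    All samples from the held-out participant go to the test set, so no
    participant contributes data to both training and testing in a fold.
    """
    for held_out in sorted(set(participant_ids)):
        train = [i for i, pid in enumerate(participant_ids) if pid != held_out]
        test = [i for i, pid in enumerate(participant_ids) if pid == held_out]
        yield held_out, train, test


# Hypothetical sample-to-participant mapping (made up for illustration).
participant_ids = ["p1", "p1", "p2", "p3", "p3"]
for held_out, train_idx, test_idx in leave_one_participant_out(participant_ids):
    print(held_out, train_idx, test_idx)
```

Splitting by participant rather than by sample matters here because several days of smartwatch data come from the same person; a per-sample split would let a model see the test participant during training.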

     
  4. Bondi, Mark (Ed.)
    Background: Advantages of digital clock drawing metrics for dementia subtype classification need examination. Objective: To assess how well kinematic, time-based, and visuospatial features extracted from the digital Clock Drawing Test (dCDT) can classify a combined group of Alzheimer’s disease/Vascular Dementia patients versus healthy controls (HC), and classify dementia patients with Alzheimer’s disease (AD) versus vascular dementia (VaD). Methods: Healthy, community-dwelling control participants (n = 175) and patients diagnosed clinically with Alzheimer’s disease (n = 29) or vascular dementia (n = 27) completed the dCDT to command and copy clock drawing conditions. Thirty-seven command and 37 copy dCDT features were extracted and used with Random Forest classification models. Results: When HC participants were compared to participants with dementia, optimal area under the curve was achieved using models that combined both command and copy dCDT features (AUC = 91.52%). Similarly, when AD versus VaD participants were compared, optimal area under the curve was achieved with models that combined both command and copy features (AUC = 76.94%). Subsequent follow-up analyses of a corpus of 10 variables of interest determined using a Gini Index found that groups could be dissociated based on kinematic, time-based, and visuospatial features. Conclusion: The dCDT is able to operationally define graphomotor output that cannot be measured using traditional paper and pencil test administration in older healthy controls and participants with dementia. These data suggest that kinematic, time-based, and visuospatial behavior obtained using the dCDT may provide additional neurocognitive biomarkers that may be able to identify and track dementia syndromes.
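The results above are reported as area under the ROC curve. For readers unfamiliar with the metric, the sketch below computes AUC directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name and the toy scores are assumptions for illustration and are unrelated to the study's Random Forest models or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 ground-truth classes.
    scores: iterable of model scores, higher meaning more likely positive.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: 1 = dementia, 0 = healthy control (made up for illustration).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

On this scale a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation; the abstract expresses the same quantity as a percentage.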
  5. Abstract

    Neuropsychiatric disorders pose a high societal cost, but their treatment is hindered by lack of objective outcomes and fidelity metrics. AI technologies and specifically Natural Language Processing (NLP) have emerged as tools to study mental health interventions (MHI) at the level of their constituent conversations. However, NLP’s potential to address clinical and research challenges remains unclear. We therefore conducted a pre-registered systematic review of NLP-MHI studies using PRISMA guidelines (osf.io/s52jh) to evaluate their models, clinical applications, and to identify biases and gaps. Candidate studies (n = 19,756), including peer-reviewed AI conference manuscripts, were collected up to January 2023 through PubMed, PsycINFO, Scopus, Google Scholar, and ArXiv. A total of 102 articles were included to investigate their computational characteristics (NLP algorithms, audio features, machine learning pipelines, outcome metrics), clinical characteristics (clinical ground truths, study samples, clinical focus), and limitations. Results indicate a rapid growth of NLP MHI studies since 2019, characterized by increased sample sizes and use of large language models. Digital health platforms were the largest providers of MHI data. Ground truth for supervised learning models was based on clinician ratings (n = 31), patient self-report (n = 29) and annotations by raters (n = 26). Text-based features contributed more to model accuracy than audio markers. Patients’ clinical presentation (n = 34), response to intervention (n = 11), intervention monitoring (n = 20), providers’ characteristics (n = 12), relational dynamics (n = 14), and data preparation (n = 4) were commonly investigated clinical categories. Limitations of reviewed studies included lack of linguistic diversity, limited reproducibility, and population bias. A research framework is developed and validated (NLPxMHI) to assist computational and clinical researchers in addressing the remaining gaps in applying NLP to MHI, with the goal of improving clinical utility, data access, and fairness.

     