Title: Detecting Clinically Relevant Emotional Distress and Functional Impairment in Children and Adolescents: Protocol for an Automated Speech Analysis Algorithm Development Study
Background: Even before the onset of the COVID-19 pandemic, children and adolescents were experiencing a mental health crisis, partly due to a lack of quality mental health services. The rate of suicide for Black youth has increased by 80%. By 2025, the health care system will be short of 225,000 therapists, further exacerbating the current crisis. Therefore, it is of utmost importance for providers, schools, and youth mental health and pediatric medical providers to integrate innovation in digital mental health to identify problems proactively and rapidly for effective collaboration with other health care providers. Such approaches can help identify robust, reproducible, and generalizable predictors and digital biomarkers of treatment response in psychiatry. Among the many digital innovations currently being explored to identify biomarkers for psychiatric diseases as part of the macrolevel digital health transformation, speech stands out as an attractive candidate because it is affordable, noninvasive, and nonintrusive.
Objective: The protocol aims to develop speech-emotion recognition algorithms leveraging artificial intelligence/machine learning, which can establish a link between trauma, stress, and voice types, including disrupting speech-based characteristics, and detect clinically relevant emotional distress and functional impairments in children and adolescents.
Methods: Informed by theoretical foundations (the Theory of Psychological Trauma Biomarkers and Archetypal Voice Categories), we developed our methodology to focus on 5 emotions: anger, happiness, fear, neutral, and sadness. Participants will be recruited from 2 local mental health centers that serve urban youths. Speech samples, along with responses to the Symptom and Functioning Severity Scale, Patient Health Questionnaire 9, and Adverse Childhood Experiences scales, will be collected using an Android mobile app. Our model development pipeline is informed by Gaussian mixture model (GMM), recurrent neural network, and long short-term memory approaches.
Results: We tested our model with a public data set. The GMM with 128 clusters showed an evenly distributed accuracy across all 5 emotions. Using utterance-level features, GMM achieved an accuracy of 79.15% overall, while frame selection increased accuracy to 85.35%. This demonstrates that GMM is a robust model for emotion classification of all 5 emotions and that emotion frame selection enhances accuracy, which is significant for scientific evaluation. Recruitment and data collection for the study were initiated in August 2021 and are currently underway. The study results are likely to be available and published in 2024.
Conclusions: This study contributes to the literature as it addresses the need for speech-focused digital health tools to detect clinically relevant emotional distress and functional impairments in children and adolescents. The preliminary results show that our algorithm has the potential to improve outcomes. The findings will contribute to the broader digital health transformation.
International Registered Report Identifier (IRRID): DERR1-10.2196/46970
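The Results above describe emotion classification with Gaussian mixture models. One common way to set this up, which the abstract does not confirm as the authors' exact procedure, is to fit one GMM per emotion and assign an utterance to the emotion whose model gives the highest total log-likelihood over its feature frames. The sketch below illustrates only that decision rule: the function names, the 2-component mixtures, and the 2-D feature values are hypothetical and hand-set, whereas the abstract reports 128 components trained on a public data set.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x) under a mixture given as a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify_utterance(frames, emotion_gmms):
    """Return the emotion whose GMM scores the frames highest, plus all scores."""
    scores = {
        emotion: sum(gmm_log_likelihood(f, gmm) for f in frames)
        for emotion, gmm in emotion_gmms.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical, hand-set mixtures over 2-D features (illustration only).
EMOTION_GMMS = {
    "anger": [(0.5, (2.0, 2.0), (1.0, 1.0)), (0.5, (3.0, 1.0), (1.0, 1.0))],
    "sadness": [(0.5, (-2.0, -2.0), (1.0, 1.0)), (0.5, (-3.0, -1.0), (1.0, 1.0))],
}

# Hypothetical feature frames for one utterance.
frames = [(2.1, 1.8), (2.9, 1.2), (2.4, 1.5)]
label, scores = classify_utterance(frames, EMOTION_GMMS)
print(label)
```

Under this rule, frame selection would amount to passing only a chosen subset of `frames` to `classify_utterance`; how the study selects frames is not stated in the abstract.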
Award ID(s): 2126811
PAR ID: 10437974
Journal Name: JMIR Research Protocols
Volume: 12
ISSN: 1929-0748
Page Range / eLocation ID: e46970
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. JMIR (Ed.)
    Psychotherapy, particularly for youth, is a pressing challenge in the health care system. Traditional methods are resource-intensive, and there is a need for objective benchmarks to guide therapeutic interventions. Automated emotion detection from speech, using artificial intelligence, presents an emerging approach to address these challenges. Speech can carry vital information about emotional states, which can be used to improve mental health care services, especially when the person is suffering. Objective: This study aims to develop and evaluate automated methods for detecting the intensity of emotions (anger, fear, sadness, and happiness) in audio recordings of patients’ speech. We also demonstrate the viability of deploying the models. Our model was validated in a previous publication by Alemu et al with limited voice samples. This follow-up study used significantly more voice samples to validate the previous model. Methods: We used audio recordings of patients, specifically children with high adverse childhood experience (ACE) scores. The average ACE score was 5 or higher, which places these children at the highest risk for chronic disease and social or emotional problems; only 1 in 6 have a score of 4 or above. The patients’ structured voice samples were collected by having them read a fixed script. In total, 4 highly trained therapists scored audio segments for each of the 4 emotions and its intensity level. We experimented with various preprocessing methods, including denoising, voice-activity detection, and diarization. Additionally, we explored various model architectures, including convolutional neural networks (CNNs) and transformers. We trained emotion-specific transformer-based models and a generalized CNN-based model to predict emotion intensities. Results: The emotion-specific transformer-based model achieved a test-set precision and recall of 86% and 79%, respectively, for binary emotional intensity classification (high or low). In contrast, the CNN-based model, generalized to predict the intensity of 4 different emotions, achieved test-set precision and recall of 83% for each. Conclusions: Automated emotion detection from patients’ speech using artificial intelligence models is found to be feasible, leading to a high level of accuracy. The transformer-based model exhibited better performance in emotion-specific detection, while the CNN-based model showed promise in generalized emotion detection. These models can serve as valuable decision-support tools for pediatricians and mental health providers to triage youth to appropriate levels of mental health care services.
  2. Emotion regulation is a powerful predictor of youth mental health and a crucial ingredient of interventions. A growing body of evidence indicates that the beliefs individuals hold about the extent to which emotions are controllable (emotion controllability beliefs) influence both the degree and the ways in which they regulate emotions. A systematic review was conducted that investigated the associations between emotion controllability beliefs and youth anxiety and depression symptoms. The search identified 21 peer-reviewed publications that met the inclusion criteria. Believing that emotions are relatively controllable was associated with fewer anxiety and depression symptoms, in part because these beliefs were associated with more frequent use of adaptive emotion regulation strategies. These findings support theoretical models linking emotion controllability beliefs with anxiety and depression symptoms via emotion regulation strategies that target emotional experience, like reappraisal. Taken together, the review findings demonstrate that emotion controllability beliefs matter for youth mental health. Understanding emotion controllability beliefs is of prime importance for basic science and practice, as it will advance understanding of mental health and provide additional targets for managing symptoms of anxiety and depression in young people.
  3. Seyedmirzaei, Homa (Ed.)
    The COVID-19 pandemic had profound effects on developing adolescents that, to date, remain incompletely understood. Youth with preexisting mental health problems and associated brain alterations were at increased risk for higher stress and poor mental health. This study investigated impacts of adolescent pre-pandemic mental health problems and their neural correlates on stress, negative emotions and poor mental health during the first 15 months of the COVID-19 pandemic. N = 2,641 adolescents (median age = 12.0 years) from the Adolescent Brain Cognitive Development (ABCD) cohort were studied, who had pre-pandemic data on anxiety, depression, and behavioral (attention, aggression, social withdrawal, internalizing, externalizing) problems, longitudinal survey data on mental health, stress and emotions during the first 15 months following the outbreak, structural MRI, and resting-state fMRI. Data were analyzed using mixed effects mediation and moderation models. Preexisting mental health and behavioral problems predicted higher stress, negative affect and negative emotions (β = 0.09–0.21, CI=[0.03,0.32]), and lower positive affect (β = −0.21 to −0.09, CI=[−0.31,-0.01]) during the first ~6 months of the outbreak. Pre-pandemic structural characteristics of brain regions supporting social function and emotional processing (insula, superior temporal gyrus, orbitofrontal cortex, and the cerebellum) mediated some of these relationships (β = 0.10–0.15, CI=[0.01,0.24]). The organization of pre-pandemic brain circuits moderated (attenuated) associations between preexisting mental health and pandemic stress and negative emotions (β = −0.17 to −0.06, CI=[−0.27,-0.01]). Preexisting mental health problems and their structural brain correlates were risk factors for youth stress and negative emotions during the early months of the outbreak. 
    In addition, the organization of some brain circuits was protective and attenuated the effects of preexisting mental health issues on youth responses to the pandemic’s stressors.
  4. In recent news, organizations have been considering the use of facial and emotion recognition for applications involving youth, such as tackling surveillance and security in schools. However, the majority of efforts on facial emotion recognition research have focused on adults. Children, particularly in their early years, have been shown to express emotions quite differently than adults. Thus, before such algorithms are deployed in environments that impact the wellbeing and circumstance of youth, their accuracy and appropriateness for this target demographic should be carefully examined. In this work, we utilize several datasets that contain facial expressions of children linked to their emotional state to evaluate eight different commercial emotion classification systems. We compare the ground truth labels provided by the respective datasets to the labels given with the highest confidence by the classification systems and assess the results in terms of matching score (TPR), positive predictive value, and failure to compute rate. Overall results show that the emotion recognition systems displayed subpar performance on the datasets of children's expressions compared to prior work with adult datasets and initial human ratings. We then identify limitations associated with automated recognition of emotions in children and suggest directions for enhancing recognition accuracy through data diversification, dataset accountability, and algorithmic regulation.
  5. Early adversity is a major risk factor for the emergence of psychopathology across development. Identifying mechanisms that support resilience, or favorable mental health outcomes despite exposure to adversity, is critical for informing clinical intervention and guiding policy to promote youth mental health. Here we propose that caregivers play a central role in fostering resilience among children exposed to adversity via caregiving influences on children’s corticolimbic circuitry and emotional functioning. We first delineate the numerous ways that caregivers support youth emotional learning and regulation and describe how early attachment lays the foundation for optimal caregiver support of youth emotional functioning in a developmental stage-specific manner. Second, we outline neural mechanisms by which caregivers foster resilience, namely, by modulating offspring corticolimbic circuitry to support emotion regulation and buffer stress reactivity. Next, we highlight the importance of developmental timing and sensitive periods in understanding caregiving-related mechanisms of resilience. Finally, we discuss clinical implications of this line of research and how findings can be translated to guide policy that promotes the well-being of youth and families.
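Items 1 and 4 above report results as precision (positive predictive value) and recall (true positive rate, or matching score). As a reminder of how these two figures are derived from labels, here is a minimal sketch for a binary high/low intensity task. The function name and the toy label lists are hypothetical and are not data from any of the listed studies.

```python
def precision_recall(y_true, y_pred, positive="high"):
    """Compute precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical rater labels and model predictions for six audio segments.
y_true = ["high", "high", "high", "low", "low", "high"]
y_pred = ["high", "low", "high", "high", "low", "high"]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "of the segments flagged as high intensity, how many were truly high?", while recall answers "of the truly high-intensity segments, how many were flagged?", which is why a triage tool may weigh the two differently.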