Abstract Objectives: Suicide presents a major public health challenge worldwide, affecting people across the lifespan. While previous studies revealed strong associations between Social Determinants of Health (SDoH) and suicide deaths, existing evidence is limited by its reliance on structured data. To address this, we aim to adapt a suicide-specific SDoH ontology (Suicide-SDoHO) and use natural language processing (NLP) to identify individual-level SDoH-related social risks from death investigation narratives. Materials and Methods: We used the latest National Violent Death Reporting System (NVDRS) data, which contain records for 267 804 suicide victims from 2003 to 2019. After adapting the Suicide-SDoHO, we developed a transformer-based model to identify SDoH-related circumstances and crises in death investigation narratives. We applied our model retrospectively to annotate narratives whose crisis variables were not coded in NVDRS. Crisis rates were calculated as the percentage of each group’s total suicide population with the crisis present. Results: The Suicide-SDoHO contains 57 fine-grained circumstances in a hierarchical structure. Our classifier achieves AUCs of 0.966 and 0.942 for classifying circumstances and crises, respectively. Through the crisis trend analysis, we observed that not everyone is equally affected by SDoH-related social risks. For the economic stability crisis, our results showed a significant increase in crisis rate in 2007–2009, in parallel with the Great Recession. Conclusions: This is the first study curating a Suicide-SDoHO using death investigation narratives. We showed that our model can effectively classify SDoH-related social risks through NLP approaches. We hope our study will facilitate the understanding of suicide crises and inform effective prevention strategies.
                            A machine learning approach to identifying suicide risk among text-based crisis counseling encounters
                        
                    
    
            Introduction: With the increasing utilization of text-based suicide crisis counseling, new means of identifying at-risk clients must be explored. Natural language processing (NLP) holds promise for evaluating the content of crisis counseling; here we use a data-driven approach to evaluate NLP methods in identifying client suicide risk. Methods: De-identified crisis counseling data from a regional text-based crisis encounter and mobile tipline application were used to evaluate two modeling approaches in classifying client suicide risk levels. A manual evaluation of model errors and system behavior was conducted. Results: The neural model outperformed a term frequency-inverse document frequency (tf-idf) model in the false-negative rate. While 75% of the neural model’s false-negative encounters had some discussion of suicidality, 62.5% saw a resolution of the client’s initial concerns. Similarly, the neural model detected signals of suicidality in 60.6% of false-positive encounters. Discussion: The neural model demonstrated greater sensitivity in the detection of client suicide risk. A manual assessment of errors and model performance reflected these same findings, detecting higher levels of risk in many of the false-positive encounters and lower levels of risk in many of the false negatives. NLP-based models can detect the suicide risk of text-based crisis encounters from the encounter’s content.
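The abstract compares the two models by false-negative rate, the share of truly at-risk encounters that a model fails to flag (equivalently, one minus sensitivity). A minimal sketch of that metric follows. The function name and the label lists are hypothetical and purely illustrative; they are not the study's data or code.

```python
def false_negative_rate(y_true, y_pred):
    """Share of truly at-risk encounters (label 1) that the model missed."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fn + tp == 0:
        raise ValueError("no positive examples in y_true")
    return fn / (fn + tp)


# Hypothetical labels: 1 = at risk, 0 = not at risk (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
neural_pred = [1, 1, 1, 0, 1, 0, 0, 0]  # misses one at-risk encounter
tfidf_pred = [1, 1, 0, 0, 0, 0, 0, 0]   # misses two at-risk encounters

neural_fnr = false_negative_rate(y_true, neural_pred)
tfidf_fnr = false_negative_rate(y_true, tfidf_pred)
print(f"neural FNR={neural_fnr:.2f}, tf-idf FNR={tfidf_fnr:.2f}")
```

In this toy setup the lower false-negative rate corresponds to the higher sensitivity, which is the sense in which the abstract describes the neural model as more sensitive.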
- Award ID(s): 1822877
- PAR ID: 10501428
- Publisher / Repository: Frontiers in Psychiatry
- Date Published:
- Journal Name: Frontiers in Psychiatry
- Volume: 14
- ISSN: 1664-0640
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Abstract Objective: The current paper examines the intersection between social vulnerability, individual risk, and social/psychological resources with adult suicidality during the COVID‐19 pandemic. Method: Data come from a national sample (n = 10,368) of U.S. adults. Using an online platform, information was gathered during the third week of March 2020 and post‐stratification weighted to proportionally represent the U.S. population in terms of age, gender, race/ethnicity, income, and geography. Results: Nearly 15 percent of sampled respondents were categorized as high risk, scoring 7+ on the Suicide Behaviors Questionnaire‐Revised (SBQ‐R). This level of risk varied across social vulnerability groupings: Blacks, Native Americans, Hispanics, families with children, unmarried, and younger respondents reported higher SBQ‐R scores than their counterparts (p < .000). Regression results confirm these bivariate differences and also reveal that risk factors (food insecurity, physical symptoms, and CES‐D symptomatology) are positively and significantly related to suicidality (p < .000). Additionally, resource measures are significantly and negatively related to suicidality (p < .000). Conclusions: These results provide some insight into the impact COVID‐19 is having on the general U.S. population. Practitioners should be prepared for what will likely be a significant mental health fall‐out in the months and years ahead.
- 
            Predicting polycystic ovary syndrome with machine learning algorithms from electronic health records
            Introduction: Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis. Methods: This is a retrospective cohort study using a safety-net hospital’s electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was PCOS ICD-9 diagnosis, with additional model outcomes of algorithm-defined PCOS. The latter was based on Rotterdam criteria and merged laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound. Results: We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82% in Models I, II, III, and IV, respectively. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG. Conclusion: Machine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.
- 
            Objective: The COVID-19 pandemic has put unprecedented stress on essential workers and their children. Limited cross-sectional research has found increases in mental health conditions from workload, reduced income, and isolation among essential workers. Less research has been conducted on children of essential workers. We examined trends in the crisis response of essential workers and their children from April 2020 through August 2021. Methods: We investigated the impact during 3 periods of the pandemic on workers and their children using anonymized data from the Crisis Text Line on crisis help-seeking texts for thoughts of suicide or active suicidal ideation (desire, intent, capability, time frame), abuse (emotional, physical, sexual, unspecified), anxiety/stress, grief, depression, isolation, bullying, eating or body image, gender/sexual identity, self-harm, and substance use. We used generalized estimating equations to study the longitudinal change in crisis response across the later stages of the pandemic using adjusted odds ratios (aORs) for worker status and crisis outcomes. Results: Results demonstrated higher odds of crisis outcomes for thoughts of suicide (aOR = 1.06; 95% CI, 1.00-1.12) and suicide capability (aOR = 1.14; 95% CI, 1.02-1.27) among essential workers than among nonessential workers. Children of essential workers had higher odds of substance use than children of nonessential workers (aOR = 1.33; 95% CI, 1.08-1.65), particularly for Indigenous American children (aOR = 2.76; 95% CI, 1.35-5.36). Essential workers (aOR = 1.17; 95% CI, 1.07-1.27) and their children (aOR = 1.18; 95% CI, 1.07-1.30) had higher odds of grief than nonessential workers and their children. Conclusion: Essential workers and their children had elevated crisis outcomes. Immediate and low-cost psychologically supportive interventions are needed to mitigate the mental health impacts of the COVID-19 pandemic on these populations.
- 
            Abstract Background: While most health-care providers now use electronic health records (EHRs) to document clinical care, many still treat them as digital versions of paper records. As a result, documentation often remains unstructured, with free-text entries in progress notes. This limits the potential for secondary use and analysis, as machine-learning and data analysis algorithms are more effective with structured data. Objective: This study aims to use advanced artificial intelligence (AI) and natural language processing (NLP) techniques to improve diagnostic information extraction from clinical notes in a periodontal use case. By automating this process, the study seeks to reduce missing data in dental records and minimize the need for extensive manual annotation, a long-standing barrier to widespread NLP deployment in dental data extraction. Materials and Methods: This research utilizes large language models (LLMs), specifically Generative Pretrained Transformer 4, to generate synthetic medical notes for fine-tuning a RoBERTa model. This model was trained to better interpret and process dental language, with particular attention to periodontal diagnoses. Model performance was evaluated by manually reviewing 360 clinical notes randomly selected from each participating site’s dataset. Results: The results demonstrated high accuracy of periodontal diagnosis data extraction, with sites 1 and 2 achieving a weighted average score of 0.97-0.98. This performance held for all dimensions of periodontal diagnosis in terms of stage, grade, and extent. Discussion: Synthetic data effectively reduced manual annotation needs while preserving model quality. Generalizability across institutions suggests viability for broader adoption, though future work is needed to improve contextual understanding. Conclusion: The study highlights the potential transformative impact of AI and NLP on health-care research. Most clinical documentation (40%-80%) is free text. Scaling our method could enhance clinical data reuse.