Leveraging natural language processing and machine learning to characterize psychological stress and life meaning and purpose in pediatric cancer survivors: a preliminary validation study

Sim, Jin-ah; Huang, Xiaolei; Webster, Rachel T; Srivastava, Kumar; Ness, Kirsten K; Hudson, Melissa M; Baker, Justin N; Huang, I-Chan

doi:10.1093/jamiaopen/ooaf018

This content will become publicly available on March 6, 2026

Objective: To determine whether natural language processing (NLP) and machine learning (ML) techniques accurately identify interview-based psychological stress and meaning/purpose data in child/adolescent cancer survivors.

Materials and Methods: Interviews were conducted with 51 survivors (aged 8-17.9 years; ≥5 years post-therapy) from St Jude Children’s Research Hospital. Two content experts coded 244 and 513 semantic units, focusing on attributes of psychological stress (anger, controllability/manageability, fear/anxiety) and attributes of meaning/purpose (goal, optimism, purpose). Content experts extracted specific attributes from the interviews, which were designated as the gold standard. Two NLP/ML methods, Word2Vec with Extreme Gradient Boosting (XGBoost) and Bidirectional Encoder Representations from Transformers Large (BERTLarge), were validated using accuracy, area under the receiver operating characteristic curve (AUROCC), and area under the precision-recall curve (AUPRC).

Results: BERTLarge demonstrated higher accuracy, AUROCC, and AUPRC than Word2Vec/XGBoost in identifying all attributes of psychological stress and meaning/purpose. BERTLarge significantly outperformed Word2Vec/XGBoost in characterizing all attributes (P < .05) except for the purpose attribute of meaning/purpose.

Discussion: These findings suggest that AI tools can help healthcare providers efficiently assess the emotional well-being of childhood cancer survivors, supporting future clinical interventions.

Conclusions: NLP/ML effectively identifies interview-based data for child/adolescent cancer survivors.

Award ID(s): 2245920

PAR ID: 10621306

Author(s) / Creator(s): Sim, Jin-ah; Huang, Xiaolei; Webster, Rachel T; Srivastava, Kumar; Ness, Kirsten K; Hudson, Melissa M; Baker, Justin N; Huang, I-Chan

Publisher / Repository: Oxford University Press

Date Published: 2025-03-06

Journal Name: JAMIA Open

Volume: 8

Issue: 2

ISSN: 2574-2531

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1093/jamiaopen/ooaf018
