

Search for: All records

Award ID contains: 2245920

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Radiology report generation, which translates radiological images into precise and clinically relevant descriptions, faces a data imbalance challenge: medical tokens appear less frequently than regular tokens, and normal entries significantly outnumber abnormal ones. Very few studies consider these imbalance issues, let alone the two imbalance factors jointly. In this study, we propose a Joint Imbalance Adaptation (JIMA) model to promote task robustness by leveraging token and label imbalance. We employ a hard-to-easy learning strategy that mitigates overfitting to frequent labels and tokens, encouraging the model to focus more on infrequent labels and clinical tokens. JIMA achieves notable improvements (16.75–50.50% on average) across evaluation metrics on the IU X-ray and MIMIC-CXR datasets. Our ablation analysis and human evaluations show that the improvements come mainly from better performance on infrequent tokens and abnormal radiological entries, which also leads to more clinically accurate reports. While data imbalance (e.g., infrequent tokens and abnormal labels) can cause radiology report generation to underperform, our imbalance learning strategy opens promising directions for countering data imbalance by reducing overfitting on frequent patterns and underfitting on infrequent patterns.
    Free, publicly-accessible full text available June 20, 2026
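    The JIMA implementation itself is not reproduced on this page; purely as a loose illustration of weighting learning toward infrequent clinical tokens, the Python sketch below re-weights a token-level cross-entropy loss by inverse corpus frequency. All function names, counts, and tensor shapes are hypothetical, and this is not the authors' hard-to-easy curriculum.

        # Minimal sketch, assuming a PyTorch decoder: up-weight infrequent tokens in the
        # report-generation loss so rare clinical terms contribute more to the gradient.
        import torch
        import torch.nn.functional as F

        def inverse_frequency_weights(token_counts, smoothing=1.0):
            """Turn raw corpus token counts into per-token loss weights."""
            counts = torch.tensor(token_counts, dtype=torch.float)
            weights = (counts.sum() + smoothing) / (counts + smoothing)
            return weights / weights.mean()  # normalize around 1.0

        def weighted_token_loss(logits, targets, token_weights):
            """Cross-entropy over a generated report, re-weighted per target token.
            logits: (batch, seq_len, vocab); targets: (batch, seq_len) gold token ids."""
            vocab = logits.size(-1)
            return F.cross_entropy(logits.view(-1, vocab), targets.view(-1),
                                   weight=token_weights)

        # Toy usage with a hypothetical 5-token vocabulary (counts are made up).
        w = inverse_frequency_weights([10_000, 5_000, 200, 50, 5])
        logits = torch.randn(2, 7, 5)
        targets = torch.randint(0, 5, (2, 7))
        print(weighted_token_loss(logits, targets, w))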
  2. Data imbalance is a fundamental challenge in applying language models to biomedical applications, particularly in ICD code prediction tasks where label and demographic distributions are uneven. While state-of-the-art language models have been increasingly adopted for biomedical tasks, few studies have systematically examined how data imbalance affects model performance and fairness across demographic groups. This study fills the gap by statistically probing the relationship between data imbalance and model performance in ICD code prediction. We analyze imbalances in a standard benchmark dataset across gender, age, ethnicity, and social determinants of health using state-of-the-art biomedical language models. By deploying diverse performance metrics and statistical analyses, we explore the influence of data imbalance on performance variations and demographic fairness. Our study shows that data imbalance significantly impacts model performance and fairness, but that feature similarity to the majority class may be a more critical factor. We believe this study provides valuable insights for developing more equitable and robust language models for healthcare applications.
    Free, publicly-accessible full text available June 18, 2026
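    The paper's exact probing pipeline is not shown here; as a hedged sketch of the kind of analysis the abstract describes, the snippet below correlates per-label F1 with ICD label frequency and compares macro-F1 across a demographic attribute. The data, the attribute names, and the 10% noise rate are invented for illustration.

        # Minimal sketch, assuming multi-label ICD predictions are already available:
        # probe (1) how label frequency relates to per-label F1 and (2) the gap in
        # macro-F1 between demographic groups.
        import numpy as np
        from scipy.stats import spearmanr
        from sklearn.metrics import f1_score

        rng = np.random.default_rng(0)
        n_notes, n_codes = 1000, 20

        # Hypothetical gold labels (rarer codes to the right) and noisy predictions.
        y_true = (rng.random((n_notes, n_codes)) < np.linspace(0.4, 0.01, n_codes)).astype(int)
        y_pred = np.where(rng.random((n_notes, n_codes)) < 0.1, 1 - y_true, y_true)
        group = rng.choice(["group_a", "group_b"], size=n_notes)  # hypothetical demographic attribute

        # Imbalance effect: rank correlation between code frequency and per-code F1.
        freq = y_true.sum(axis=0)
        per_code_f1 = f1_score(y_true, y_pred, average=None, zero_division=0)
        rho, p = spearmanr(freq, per_code_f1)
        print(f"code frequency vs F1: rho={rho:.2f}, p={p:.3g}")

        # Fairness gap: macro-F1 computed separately for each demographic group.
        for g in np.unique(group):
            mask = group == g
            print(g, round(f1_score(y_true[mask], y_pred[mask], average="macro", zero_division=0), 3))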
  3. Objective: To determine whether natural language processing (NLP) and machine learning (ML) techniques accurately identify interview-based psychological stress and meaning/purpose data in child/adolescent cancer survivors. Materials and Methods: Interviews were conducted with 51 survivors (aged 8-17.9 years; ≥5 years post-therapy) from St Jude Children’s Research Hospital. Two content experts coded 244 and 513 semantic units, focusing on attributes of psychological stress (anger, controllability/manageability, fear/anxiety) and attributes of meaning/purpose (goal, optimism, purpose). The attributes extracted from the interviews by the content experts were designated as the gold standard. Two NLP/ML methods, Word2Vec with Extreme Gradient Boosting (XGBoost) and Bidirectional Encoder Representations from Transformers Large (BERT-Large), were validated using accuracy, the area under the receiver operating characteristic curve (AUROC), and the area under the precision-recall curve (AUPRC). Results: BERT-Large demonstrated higher accuracy, AUROC, and AUPRC than Word2Vec/XGBoost in identifying all attributes of psychological stress and meaning/purpose, and significantly outperformed Word2Vec/XGBoost in characterizing all attributes (P < .05) except the purpose attribute of meaning/purpose. Discussion: These findings suggest that AI tools can help healthcare providers efficiently assess the emotional well-being of childhood cancer survivors, supporting future clinical interventions. Conclusions: NLP/ML effectively identifies interview-based data for child/adolescent cancer survivors.
    Free, publicly-accessible full text available March 6, 2026
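    The study's models cannot be reproduced from the abstract alone; purely as an illustration of the reported evaluation metrics, the sketch below scores a hypothetical attribute classifier (e.g., presence of fear/anxiety in a semantic unit) with accuracy, AUROC, and AUPRC using scikit-learn. The labels and probabilities are invented.

        # Minimal sketch: compute the three metrics named in the abstract for one
        # binary attribute, given gold codes and predicted probabilities.
        import numpy as np
        from sklearn.metrics import accuracy_score, roc_auc_score, average_precision_score

        y_true = np.array([1, 0, 0, 1, 0, 1, 0, 0, 1, 0])                      # hypothetical expert codes
        y_prob = np.array([0.9, 0.2, 0.4, 0.7, 0.1, 0.6, 0.3, 0.2, 0.8, 0.5])  # model scores

        print("accuracy:", accuracy_score(y_true, (y_prob >= 0.5).astype(int)))
        print("AUROC:   ", roc_auc_score(y_true, y_prob))
        print("AUPRC:   ", average_precision_score(y_true, y_prob))  # area under the PR curve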
  4. Time is an inherent factor in applying language models to biomedical applications: models are trained on historical data and deployed on new or future data, which may differ from the training data. While a growing number of biomedical tasks employ state-of-the-art language models, very few studies have examined temporal effects on biomedical models when data shift between development and deployment. This study fills the gap by statistically probing the relations between language model performance and data shifts across three biomedical tasks. We deploy diverse metrics to evaluate model performance, distance methods to measure data drift, and statistical methods to quantify temporal effects on biomedical language models. Our study shows that time matters when deploying biomedical language models, while the degree of performance degradation varies by biomedical task and statistical quantification approach. We believe this study can establish a solid benchmark for evaluating and assessing temporal effects on deploying biomedical language models.
    Free, publicly-accessible full text available November 20, 2025
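    The paper's specific distance methods are not listed in the abstract; as one hedged example of measuring data drift between development and deployment corpora, the sketch below compares unigram distributions with the Jensen-Shannon distance. The toy documents and the add-one smoothing choice are assumptions, not the benchmark data or the authors' method.

        # Minimal sketch: quantify lexical drift between a "training-era" corpus and a
        # later "deployment-era" corpus; a larger distance could then be related
        # statistically to any observed performance degradation.
        from collections import Counter
        import numpy as np
        from scipy.spatial.distance import jensenshannon

        def unigram_dist(docs, vocab):
            counts = Counter(tok for d in docs for tok in d.lower().split())
            vec = np.array([counts[w] for w in vocab], dtype=float) + 1.0  # add-one smoothing
            return vec / vec.sum()

        train_docs = ["patient denies chest pain", "no acute distress noted"]
        deploy_docs = ["telehealth visit for covid symptoms", "patient reports anosmia"]

        vocab = sorted({t for d in train_docs + deploy_docs for t in d.lower().split()})
        drift = jensenshannon(unigram_dist(train_docs, vocab), unigram_dist(deploy_docs, vocab))
        print(f"Jensen-Shannon distance between corpora: {drift:.3f}")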
  5. Mortazavi, Bobak J; Sarker, Tasmie; Beam, Andrew; Ho, Joyce C (Ed.)
    Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. Token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but carry more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets for radiology report generation (IU X-ray and MIMIC-CXR). To address this challenge, we propose the Token Imbalance Adapter (TIMER), which aims to improve generation robustness on infrequent tokens. The model automatically leverages token imbalance through an unlikelihood loss and dynamically optimizes the generation process to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method has a major effect in adapting to token imbalance for radiology report generation.
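    The released TIMER code is not shown on this page; since the abstract names an unlikelihood loss, the sketch below follows the generic unlikelihood-training form (penalizing probability mass on a designated negative token set, here assumed to be over-frequent tokens) added to the usual likelihood term. The mixing weight, the negative-set choice, and the omission of the reinforcement-learning component are assumptions, not details from the paper.

        # Minimal sketch, assuming a PyTorch decoder: likelihood loss on gold tokens plus an
        # unlikelihood term that pushes probability away from a "negative" (frequent) token set.
        import torch
        import torch.nn.functional as F

        def unlikelihood_term(logits, negative_mask, eps=1e-8):
            """logits: (batch, seq_len, vocab); negative_mask: (vocab,) bool tensor,
            True for tokens the model should assign less probability to."""
            probs = logits.softmax(dim=-1)
            neg_probs = probs[..., negative_mask]  # probability mass on frequent tokens
            return -torch.log(1.0 - neg_probs + eps).sum(dim=-1).mean()

        def combined_loss(logits, targets, negative_mask, alpha=0.5):
            nll = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
            return nll + alpha * unlikelihood_term(logits, negative_mask)

        # Toy usage with a hypothetical 6-token vocabulary where tokens 0 and 1 are "frequent".
        logits = torch.randn(2, 5, 6)
        targets = torch.randint(0, 6, (2, 5))
        negative_mask = torch.tensor([True, True, False, False, False, False])
        print(combined_loss(logits, targets, negative_mask))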