Title: CohortNet: Empowering Cohort Discovery for Interpretable Healthcare Analytics
Cohort studies are of significant importance in the field of healthcare analytics. However, existing methods typically involve manual, labor-intensive, and expert-driven pattern definitions or rely on simplistic clustering techniques that lack medical relevance. Automating cohort studies with interpretable patterns has great potential to facilitate healthcare analytics and data management but remains an unmet need in prior research efforts. In this paper, we present a cohort auto-discovery framework for interpretable healthcare analytics. It focuses on the effective identification, representation, and exploitation of cohorts characterized by medically meaningful patterns. In the framework, we propose CohortNet, a core model that can learn fine-grained patient representations by separately processing each feature, considering both individual feature trends and feature interactions at each time step. Subsequently, it employs K-Means in an adaptive manner to classify each feature into distinct states and a heuristic cohort exploration strategy to effectively discover substantial cohorts with concrete patterns. For each identified cohort, it learns comprehensive cohort representations with credible evidence through associated patient retrieval. Ultimately, given a new patient, CohortNet can leverage relevant cohorts with distinguished importance, which can provide a more holistic understanding of the patient's conditions. Extensive experiments on three real-world datasets demonstrate that CohortNet consistently outperforms state-of-the-art approaches, resulting in improvements in AUC-PR scores ranging from 2.8% to 4.1%, and offers interpretable insights from diverse perspectives in a top-down fashion.
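The two discovery steps named in the abstract (clustering each feature into discrete states, then finding patterns of states shared by enough patients) can be illustrated with a minimal sketch. This is not the CohortNet implementation: it clusters raw scalar values rather than learned representations, uses a fixed `k` rather than an adaptive one, and enumerates feature combinations exhaustively instead of applying the paper's heuristic exploration strategy. The function names, the sample features `heart_rate` and `lactate`, and all parameter values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars; returns the centers in ascending order."""
    ordered = sorted(values)
    # Deterministic init: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center when a cluster receives no points.
        updated = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def to_state(value, centers):
    """Map a value to the index of its nearest center (0 = lowest state)."""
    return min(range(len(centers)), key=lambda c: abs(value - centers[c]))


def discover_cohorts(patients, k=2, min_size=2, max_features=2):
    """Count state patterns over feature subsets; keep those with enough patients."""
    features = sorted(patients[0])
    centers = {f: kmeans_1d([p[f] for p in patients], k) for f in features}
    states = [{f: to_state(p[f], centers[f]) for f in features} for p in patients]
    counts = Counter()
    for s in states:
        for r in range(1, max_features + 1):
            for combo in combinations(features, r):
                counts[tuple((f, s[f]) for f in combo)] += 1
    return {pattern: n for pattern, n in counts.items() if n >= min_size}


# Hypothetical single-time-step measurements for five patients.
patients = [
    {"heart_rate": 62, "lactate": 1.0},
    {"heart_rate": 65, "lactate": 1.2},
    {"heart_rate": 118, "lactate": 4.1},
    {"heart_rate": 121, "lactate": 3.9},
    {"heart_rate": 64, "lactate": 4.0},
]
cohorts = discover_cohorts(patients, k=2, min_size=2)
```

Each key in `cohorts` is a tuple of `(feature, state)` pairs and each value is the number of patients matching that pattern, so a pattern such as high heart rate together with high lactate stays readable as a concrete rule.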
Award ID(s):
2312931 2106176
PAR ID:
10612920
Publisher / Repository:
VLDB
Journal Name:
Proceedings of the VLDB Endowment
Volume:
17
Issue:
10
ISSN:
2150-8097
Page Range / eLocation ID:
2487 to 2500
Sponsoring Org:
National Science Foundation
More Like this
  1. Objective: Visual cohort analysis utilizing electronic health record data has become an important tool in clinical assessment of patient outcomes. In this article, we introduce Composer, a visual analysis tool for orthopedic surgeons to compare changes in physical functions of a patient cohort following various spinal procedures. The goal of our project is to help researchers analyze outcomes of procedures and facilitate informed decision-making about treatment options between patient and clinician. Methods: In collaboration with orthopedic surgeons and researchers, we defined domain-specific user requirements to inform the design. We developed the tool in an iterative process with our collaborators to develop and refine functionality. With Composer, analysts can dynamically define a patient cohort using demographic information, clinical parameters, and events in patient medical histories and then analyze patient-reported outcome scores for the cohort over time, as well as compare it to other cohorts. Using Composer's current iteration, we provide a usage scenario for use of the tool in a clinical setting. Conclusion: We have developed a prototype cohort analysis tool to help clinicians assess patient treatment options by analyzing prior cases with similar characteristics. Although Composer was designed using patient data specific to orthopedic research, we believe the tool is generalizable to other healthcare domains. A long-term goal for Composer is to develop the application into a shared decision-making tool that allows translation of comparison and analysis from a clinician-facing interface into visual representations to communicate treatment options to patients.
  2. We propose and test a novel graph learning-based explainable artificial intelligence (XAI) approach to address the challenge of developing explainable predictions of patient length of stay (LoS) in intensive care units (ICUs). Specifically, we address a notable gap in the literature on XAI methods that identify interactions between model input features to predict patient health outcomes. Our model intrinsically constructs a patient-level graph, which identifies the importance of feature interactions for prediction of health outcomes. It demonstrates state-of-the-art explanation capabilities based on identification of salient feature interactions compared with traditional XAI methods for prediction of LoS. We supplement our XAI approach with a small-scale user study, which demonstrates that our model can lead to greater user acceptance of artificial intelligence (AI) model-based decisions by contributing to greater interpretability of model predictions. Our model lays the foundation to develop interpretable, predictive tools that healthcare professionals can utilize to improve ICU resource allocation decisions and enhance the clinical relevance of AI systems in providing effective patient care. Although our primary research setting is the ICU, our graph learning model can be generalized to other healthcare contexts to accurately identify key feature interactions for prediction of other health outcomes, such as mortality, readmission risk, and hospitalizations. 
  3. Fast diagnostic results using breath analysis are an anticipated possibility for disease diagnosis or general health screenings. Tests that do not require sending specimens to medical laboratories possess capabilities to speed patient diagnosis and protect both patient and healthcare staff from unnecessary prolonged exposure. The objective of this work was to develop testing procedures on an initial healthy subject cohort in Hawaii to act as a range-finding pilot study for characterizing the baseline of exhaled breath prior to further research. Using comprehensive two-dimensional gas chromatography (GC×GC), this study analyzed exhaled breath from a healthy adult population in Hawaii to profile the range of different volatile organic compounds (VOCs) and survey Hawaii-specific differences. The most consistently reported compounds in the breath profile of individuals were acetic acid, dimethoxymethane, benzoic acid methyl ester, and n-hexane. In comparison to other breathprinting studies, the list of compounds discovered was representative of control cohorts. This must be considered when implementing proposed breath diagnostics in new locations with increased interpersonal variation due to diversity. Further studies on larger numbers of subjects over longer periods of time will provide additional foundational data on baseline breath VOC profiles of control populations for comparison to disease-positive cohorts. 
  4. With the wide application of electronic health records (EHR) in healthcare facilities, health event prediction with deep learning has gained more and more attention. A common feature of EHR data used for deep-learning-based predictions is historical diagnoses. Existing work mainly regards a diagnosis as an independent disease and does not consider clinical relations among diseases in a visit. Many machine learning approaches assume disease representations are static in different visits of a patient. However, in real practice, multiple diseases that are frequently diagnosed at the same time reflect hidden patterns that are conducive to prognosis. Moreover, the development of a disease is not static since some diseases can emerge or disappear and show various symptoms in different visits of a patient. To effectively utilize this combinational disease information and explore the dynamics of diseases, we propose a novel context-aware learning framework using transition functions on dynamic disease graphs. Specifically, we construct a global disease co-occurrence graph with multiple node properties for disease combinations. We design dynamic subgraphs for each patient's visit to leverage global and local contexts. We further define three diagnosis roles in each visit based on the variation of node properties to model disease transition processes. Experimental results on two real-world EHR datasets show that the proposed model outperforms state of the art in predicting health events. 
  5. Healthcare capacity shortage contributes to poor access in many countries. Moreover, the rapid urbanization often occurring in these countries has exacerbated the imbalance between healthcare capacity and need. One way to address this challenge is to expand total capacity and redistribute it spatially. In this research, we studied the problem of locating new hospitals in a two-tier outpatient care system comprising multiple central and district hospitals, and upgrading existing district hospitals to central hospitals. We formulated the problem with a discrete location optimization model. To parameterize the optimization model, we used a multinomial logit model to characterize individual patients' diverse hospital choices and to quantify the patient arrival rates at each hospital accordingly. To solve the hard nonlinear combinatorial optimization problem, we developed a queueing network model to approximate the impact of hospital locations on patient flows. We then proposed a multi-fidelity optimization approach, which involves both the aforementioned queueing network model as a surrogate and a self-developed stochastic simulation as the high-fidelity model. With a real-world case study of Shanghai, we demonstrated the changes in the care network and examined how the network design is affected by the emergence of population centers, changes in the governmental budget, and the consideration of patients in different age groups or income levels. Note to Practitioners: Our work focuses on improving system-wide care access in a two-tier care network. We believe that our work can lead to effective development of a location analytics tool for city-wide healthcare system planners. We also think the importance of this study is further strengthened by the case studies based on real-world hospital choice experimental data from Shanghai, China, a region suffering from the imbalance between healthcare capacity and need. Our case studies are expected to make recommendations on care facility expansion and dispersion to better align with the spatial distribution of residential communities and patient hospital choice behavior.