Title: HIST-AID: Leveraging Historical Patient Reports for Enhanced Multi-Modal Automatic Diagnosis
Chest X-ray imaging is a widely accessible and non-invasive diagnostic tool for detecting thoracic abnormalities. While numerous AI models assist radiologists in interpreting these images, most overlook patients' historical data. To bridge this gap, we introduce the Temporal MIMIC dataset, which integrates five years of patient history, including radiographic scans and reports from MIMIC-CXR and MIMIC-IV, encompassing 12,221 patients and thirteen pathologies. Building on this, we present HIST-AID, a framework that enhances automatic diagnostic accuracy using historical reports. HIST-AID emulates the radiologist's comprehensive approach, leveraging historical data to improve diagnostic accuracy. Our experiments demonstrate significant improvements, with AUROC increasing by 6.56% and AUPRC by 9.51% compared to models that rely solely on radiographic scans. These gains were consistently observed across diverse demographic groups, including variations in gender, age, and racial categories. We show that while recent data boost performance, older data may reduce accuracy due to changes in patient conditions. Our work demonstrates the potential of incorporating historical data for more reliable automatic diagnosis, providing critical support for clinical decision-making.
Award ID(s):
1922658
PAR ID:
10649479
Publisher / Repository:
Proceedings of Machine Learning Research (Volume 259, 2024)
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Introduction: Increased use of telemedicine could potentially streamline influenza diagnosis and reduce transmission. However, telemedicine diagnoses are dependent on accurate symptom reporting by patients. If patients disagree with clinicians on symptoms, previously derived diagnostic rules may be inaccurate. Methods: We performed a secondary data analysis of a prospective, nonrandomized cohort study at a university student health center. Patients who reported an upper respiratory complaint were required to report symptoms, and their clinician was required to report the same list of symptoms. We examined the performance of 5 previously developed clinical decision rules (CDRs) for influenza on both symptom reports. These predictions were compared against PCR diagnoses. We analyzed the agreement between symptom reports, and we built new predictive models using both sets of data. Results: CDR performance was always lower for the patient-reported symptom data, compared with clinician-reported symptom data. CDRs often resulted in different predictions for the same individual, driven by disagreement in symptom reporting. We were able to fit new models to the patient-reported data, which performed slightly worse than previously derived CDRs. These models and models built on clinician-reported data both suffered from calibration issues. Discussion: Patients and clinicians frequently disagree about symptom presence, which leads to reduced accuracy when CDRs built with clinician data are applied to patient-reported symptoms. Predictive models using patient-reported symptom data performed worse than models using clinician-reported data and prior results in the literature. However, the differences are minor, and developing new models with more data may be possible. (J Am Board Fam Med 2023;00:000–000.)
  2. Purpose: This article introduces a novel deep learning approach to substantially improve the accuracy of colon segmentation even with limited data annotation, which enhances the overall effectiveness of the CT colonography pipeline in clinical settings. Methods: The proposed approach integrates 3D contextual information via guided sequential episodic training in which a query CT slice is segmented by exploiting its previous labeled CT slice (i.e., support). Segmentation starts by detecting the rectum using a Markov Random Field-based algorithm. Then, supervised sequential episodic training is applied to the remaining slices, while contrastive learning is employed to enhance feature discriminability, thereby improving segmentation accuracy. Results: The proposed method, evaluated on 98 abdominal scans of prepped patients, achieved a Dice coefficient of 97.3% and a polyp information preservation accuracy of 98.28%. Statistical analysis, including 95% confidence intervals, underscores the method’s robustness and reliability. Clinically, this high level of accuracy is vital for ensuring the preservation of critical polyp details, which are essential for accurate automatic diagnostic evaluation. The proposed method performs reliably in scenarios with limited annotated data. This is demonstrated by achieving a Dice coefficient of 97.15% when the model was trained on a smaller number of annotated CT scans (e.g., 10 scans) than the testing dataset (e.g., 88 scans). Conclusions: The proposed sequential segmentation approach achieves promising results in colon segmentation. A key strength of the method is its ability to generalize effectively, even with limited annotated datasets, a common challenge in medical imaging.
  3. Doshi-Velez, Finale; Fackler, Jim; Jung, Ken; Kale, David; Ranganath, Rajesh; Wallace, Byron; Wiens, Jenna (Ed.)
    Electronic Health Records (EHRs) provide vital contextual information to radiologists and other physicians when making a diagnosis. Unfortunately, because a given patient’s record may contain hundreds of notes and reports, identifying relevant information within these in the short time typically allotted to a case is very difficult. We propose and evaluate models that extract relevant text snippets from patient records to provide a rough case summary intended to aid physicians considering one or more diagnoses. This is hard because direct supervision (i.e., physician annotations of snippets relevant to specific diagnoses in medical records) is prohibitively expensive to collect at scale. We propose a distantly supervised strategy in which we use groups of International Classification of Diseases (ICD) codes observed in ‘future’ records as noisy proxies for ‘downstream’ diagnoses. Using this we train a transformer-based neural model to perform extractive summarization conditioned on potential diagnoses. This model defines an attention mechanism that is conditioned on potential diagnoses (queries) provided by the diagnosing physician. We train (via distant supervision) and evaluate variants of this model on EHR data from Brigham and Women’s Hospital in Boston and MIMIC-III (the latter to facilitate reproducibility). Evaluations performed by radiologists demonstrate that these distantly supervised models yield better extractive summaries than do unsupervised approaches. Such models may aid diagnosis by identifying sentences in past patient reports that are clinically relevant to a potential diagnosis. Code is available at https://github.com/dmcinerney/ehr-extraction-models. 
  4. OBJECTIVES: Prediction and determination of drug efficacy for radiographic progression is limited by the heterogeneity inherent in axial spondyloarthritis (axSpA). We investigated whether unbiased clustering analysis of phenotypic data can lead to coherent subgroups of axSpA patients with a distinct risk of radiographic progression. METHODS: A group of 412 patients with axSpA was clustered in an unbiased way using an agglomerative hierarchical clustering method, based on their phenotype mapping. We used a generalised linear model, naïve Bayes, Decision Trees, K-Nearest-Neighbors, and Support Vector Machines to construct a consensus classification method. Radiographic progression over 2 years was assessed using the modified Stoke Ankylosing Spondylitis Spine Score (mSASSS). RESULTS: axSpA patients were classified into three distinct subgroups with distinct clinical characteristics. Sex, smoking, HLA-B27, baseline mSASSS, uveitis, and peripheral arthritis were the key features that were found to stratify the phenogroups. The three phenogroups showed distinct differences in radiographic progression rate (p<0.05) and the proportion of progressors (p<0.001). Phenogroup 2, consisting of male smokers, had the worst radiographic progression, while phenogroup 3, exclusively suffering from uveitis, showed the least radiographic progression. The axSpA phenogroup classification, including its ability to stratify risk, was successfully replicated in an independent validation group. CONCLUSIONS: Phenotype mapping results in a clinically relevant classification of axSpA that is applicable for risk stratification. Novel coupling between phenotypic features and radiographic progression can provide a glimpse into the mechanisms underlying divergent and shared features of axSpA.
  5. Central nervous system (CNS) tumors come with vastly heterogeneous histologic, molecular, and radiographic landscapes, rendering their precise characterization challenging. The rapidly growing fields of biophysical modeling and radiomics have shown promise in better characterizing the molecular, spatial, and temporal heterogeneity of tumors. Integrative analysis of CNS tumors, including clinically acquired multi-parametric magnetic resonance imaging (mpMRI) and the inverse problem of calibrating biophysical models to mpMRI data, assists in identifying macroscopic quantifiable tumor patterns of invasion and proliferation, potentially leading to improved (a) detection/segmentation of tumor subregions and (b) computer-aided diagnostic/prognostic/predictive modeling. This article presents a summary of (a) biophysical growth modeling and simulation, (b) inverse problems for model calibration, (c) these models' integration with imaging workflows, and (d) their application to clinically relevant studies. We anticipate that such quantitative integrative analysis may even be beneficial in a future revision of the World Health Organization (WHO) classification for CNS tumors, ultimately improving patient survival prospects.