

Title: Using Causal Analysis for Conceptual Deep Learning Explanation
Model explainability is essential for creating trustworthy machine learning models in healthcare. An ideal explanation resembles the decision-making process of a domain expert and is expressed using concepts or terminology that are meaningful to clinicians. To provide such an explanation, we first associate the hidden units of the classifier with clinically relevant concepts. We take advantage of the radiology reports accompanying the chest X-ray images to define concepts, and we discover sparse associations between concepts and hidden units using a sparse linear logistic regression. To ensure that the identified units truly influence the classifier's outcome, we adopt tools from the causal inference literature, specifically mediation analysis through counterfactual interventions. Finally, we construct a low-depth decision tree to translate all the discovered concepts into a straightforward decision rule, expressed to the radiologist. We evaluated our approach on a large chest X-ray dataset, where our model produces a global explanation consistent with clinical knowledge.
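As a rough illustration of the association step, the sketch below fits an L1-regularized (sparse) logistic regression that predicts a report-derived concept from hidden-unit activations and keeps the units with nonzero weight. The array names, the concept example, and the regularization strength are assumptions for illustration, not the paper's code.

```python
# Minimal sketch of associating one concept with classifier hidden units via
# sparse (L1-penalized) logistic regression. `activations` is an
# (n_images, n_units) array of hidden-unit activations; `has_concept` is an
# (n_images,) binary vector marking whether the concept (e.g., "effusion")
# appears in the image's radiology report. Both names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_units(activations, has_concept, sparsity=0.1):
    """Return indices of hidden units with nonzero weight for the concept."""
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=sparsity)
    clf.fit(activations, has_concept)
    return np.flatnonzero(clf.coef_[0])
```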
Award ID(s): 1839332
NSF-PAR ID: 10299285
Journal Name: International Conference on Medical Image Computing and Computer-Assisted Intervention
Page Range / eLocation ID: 519-528
Sponsoring Org: National Science Foundation
More Like This
  1. During the coronavirus disease 2019 (COVID-19) pandemic, rapid and accurate triage of patients at the emergency department is critical to inform decision-making. We propose a data-driven approach for automatic prediction of deterioration risk using a deep neural network that learns from chest X-ray images and a gradient boosting model that learns from routine clinical variables. Our AI prognosis system, trained using data from 3661 patients, achieves an area under the receiver operating characteristic curve (AUC) of 0.786 (95% CI: 0.745–0.830) when predicting deterioration within 96 hours. The deep neural network extracts informative areas of chest X-ray images to assist clinicians in interpreting the predictions, and it performs comparably to two radiologists in a reader study. To verify performance in a real clinical setting, we silently deployed a preliminary version of the deep neural network at New York University Langone Health during the first wave of the pandemic, where it produced accurate predictions in real time. In summary, our findings demonstrate the potential of the proposed system for assisting front-line physicians in the triage of COVID-19 patients.
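A hedged sketch of the two-learner setup this abstract describes: a gradient boosting model is trained on routine clinical variables and its predicted risk is combined with the image model's score. The variable names and the simple score-averaging fusion are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch: fuse a chest X-ray model's risk scores with a gradient boosting
# model over clinical variables. `clinical_train`, `labels_train`, and the
# held-out arrays are hypothetical inputs; averaging the two probabilities
# is one simple fusion choice among many.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def deterioration_risk(cnn_scores, clinical_train, labels_train, clinical_test):
    gbm = GradientBoostingClassifier()
    gbm.fit(clinical_train, labels_train)                # tabular learner
    gbm_scores = gbm.predict_proba(clinical_test)[:, 1]  # P(deterioration)
    return (np.asarray(cnn_scores) + gbm_scores) / 2.0   # simple average
```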

     
  2. Purpose: Prior studies show convolutional neural networks predicting self-reported race using X-rays of the chest, hand, and spine, chest computed tomography, and mammograms. We seek an understanding of the mechanism that reveals race within X-ray images, investigating the possibility that race is not predicted from the physical structure in X-ray images but is embedded in the grayscale pixel intensities. Approach: 298,827 AP/PA chest X-ray images from full-year 2021, drawn from three academic health centers across the United States and from MIMIC-CXR and labeled by self-reported race, were used in this retrospective study. Image structure is removed by counting the occurrences of each grayscale value and scaling the counts to percent per image (PPI). The resulting data are tested using multivariate analysis of variance (MANOVA) with Bonferroni multiple-comparison adjustment and a class-balanced MANOVA. Machine learning (ML) feed-forward networks (FFNs) and decision trees were built to predict race (binary Black or White, and binary Black or other) using only the grayscale value counts. A stratified analysis by body mass index, age, sex, gender, patient type, make/model of scanner, exposure, and kilovoltage peak setting was run, following the same methodology, to study the impact of these factors on race prediction. Results: MANOVA rejects the null hypothesis that the classes are the same at 95% confidence (F = 7.38, P < 0.0001), as does the class-balanced MANOVA (F = 2.02, P < 0.0001). The best FFN performance is limited [area under the receiver operating characteristic curve (AUROC) of 69.18%]. Gradient-boosted trees predict self-reported race using grayscale PPI (AUROC 77.24%). Conclusions: Within chest X-rays, pixel intensity value counts alone are statistically significant indicators of, and sufficient for ML classification of, patient self-reported race.
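The structure-removal step lends itself to a short sketch: collapse each 8-bit grayscale image into a 256-bin histogram and scale it to percent per image (PPI), discarding all spatial information. The function name and the uint8 assumption are illustrative.

```python
# Sketch of the percent-per-image (PPI) transform: count each grayscale value
# and scale to percent, so only intensity statistics (no anatomy) remain.
import numpy as np

def grayscale_ppi(image):
    """`image`: 2-D uint8 array. Returns a length-256 vector summing to 100."""
    counts = np.bincount(image.ravel(), minlength=256)
    return 100.0 * counts / counts.sum()
```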
  3. Manual examination of chest X-rays is a time-consuming process that involves significant effort by expert radiologists. Recent work attempts to alleviate this problem by developing learning-based automated chest X-ray analysis systems that map images to multi-label diagnoses using deep neural networks. These methods are often treated as black boxes, or they output attention maps but do not explain why the attended areas are important. Given data consisting of a frontal-view X-ray, a set of natural language findings, and one or more diagnostic impressions, we propose a deep neural network model that during training simultaneously 1) constructs a topic model which clusters key terms from the findings into meaningful groups, 2) predicts the presence of each topic for a given input image based on learned visual features, and 3) uses an image's predicted topic encoding as features to predict one or more diagnoses. Since the net learns the topic model jointly with the classifier, it gives us a powerful tool for understanding which semantic concepts the net might be exploiting when making diagnoses, and since we constrain the net to predict topics based on expert-annotated reports, the net automatically encodes some higher-level expert knowledge about how to make diagnoses.
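A hedged sketch of the bottleneck structure this describes: visual features predict topic presence, and diagnoses are predicted only from the topic encoding, so every diagnosis is mediated by interpretable topics. The layer shapes and single-linear-layer heads are assumptions for illustration, not the paper's network.

```python
# Sketch of a topic-bottleneck classifier: image features -> topic
# probabilities -> diagnosis logits. Sizes and heads are illustrative.
import torch
import torch.nn as nn

class TopicBottleneck(nn.Module):
    def __init__(self, n_visual, n_topics, n_diagnoses):
        super().__init__()
        self.topic_head = nn.Linear(n_visual, n_topics)
        self.diagnosis_head = nn.Linear(n_topics, n_diagnoses)

    def forward(self, visual_features):
        topics = torch.sigmoid(self.topic_head(visual_features))  # topic presence
        return topics, self.diagnosis_head(topics)                # diagnosis logits
```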
  4. End-to-end deep learning models are increasingly applied to safety-critical human activity recognition (HAR) applications, e.g., healthcare monitoring and smart home control, to reduce developer burden and to increase the performance and robustness of prediction models. However, integrating HAR models in safety-critical applications requires trust, and recent approaches have aimed to balance the performance of deep learning models with explainable decision-making for complex activity recognition. Prior works have exploited the compositionality of complex HAR (i.e., higher-level activities composed of lower-level activities) to form models with symbolic interfaces, such as concept-bottleneck architectures, that facilitate inherently interpretable models. However, feature engineering for symbolic concepts, as well as for the relationships between the concepts, requires precise annotation of lower-level activities by domain experts, usually with fixed time windows, all of which induces a heavy and error-prone workload on the domain expert. In this paper, we introduce X-CHAR, an eXplainable Complex Human Activity Recognition model that does not require precise annotation of low-level activities, offers explanations in the form of human-understandable, high-level concepts, and maintains the robust performance of end-to-end deep learning models for time-series data. X-CHAR learns to model complex activity recognition as a sequence of concepts. For each classification, X-CHAR outputs a sequence of concepts and a counterfactual example as the explanation. We show that the sequence information of the concepts can be modeled using the Connectionist Temporal Classification (CTC) loss without accurate start and end times of low-level annotations in the training dataset, significantly reducing developer burden. We evaluate our model on several complex activity datasets and demonstrate that it offers explanations without compromising prediction accuracy in comparison to baseline models. Finally, we conducted a Mechanical Turk study to show that the explanations provided by our model are more understandable than the explanations from existing methods for complex activity recognition.
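The CTC step admits a compact sketch: per-frame concept probabilities from a time-series encoder are aligned to a target concept sequence without any start/end annotations. The shapes follow torch.nn.CTCLoss conventions; the concept vocabulary size and the random inputs are placeholders, not X-CHAR's actual model.

```python
# Sketch: align per-frame concept predictions to a concept sequence with CTC,
# so no frame-level (start/end) annotations are needed. Values are placeholders.
import torch
import torch.nn as nn

T, N, C = 50, 1, 6                               # frames, batch, concepts (+blank)
log_probs = torch.randn(T, N, C).log_softmax(2)  # encoder's per-frame outputs
targets = torch.tensor([[1, 3, 2]])              # target concept id sequence
input_lengths = torch.tensor([T])
target_lengths = torch.tensor([3])

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```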
  5. Coronavirus Disease 2019 (COVID-19) is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The virus transmits rapidly and has a basic reproduction number (R0) of 2.2-2.7. In March 2020, the World Health Organization declared the COVID-19 outbreak a pandemic. COVID-19 is currently affecting more than 200 countries, with 6M active cases. An effective testing strategy for COVID-19 is crucial to controlling the outbreak, but the demand for testing surpasses the availability of test kits that use Reverse Transcription Polymerase Chain Reaction (RT-PCR). In this paper, we present a technique to screen for COVID-19 using artificial intelligence. Our technique takes only seconds to screen for the presence of the virus in a patient. We collected a dataset of chest X-ray images and trained several popular deep convolutional neural network-based models (VGG, MobileNet, Xception, DenseNet, InceptionResNet) to classify the chest X-rays. Unsatisfied with these models, we then designed and built a Residual Attention Network that was able to screen for COVID-19 with a testing accuracy of 98% and a validation accuracy of 100%. A feature-map visualization of our model shows the areas in a chest X-ray that are important for classification. Our work can help to increase the adoption of AI-assisted applications in clinical practice. The code and dataset used in this project are available at https://github.com/vishalshar/covid-19-screening-using-RAN-on-X-ray-images.
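As context for the baseline comparison above, here is a minimal transfer-learning sketch with one of the pretrained backbones the authors tried (DenseNet); the Residual Attention Network itself is not reproduced, and the classification head, input size, and optimizer are illustrative choices, not the paper's configuration.

```python
# Sketch of a pretrained-backbone baseline for binary COVID-19 screening on
# chest X-rays; layer and optimizer choices are assumptions for illustration.
from tensorflow.keras import applications, layers, models

base = applications.DenseNet121(include_top=False, weights="imagenet",
                                input_shape=(224, 224, 3), pooling="avg")
model = models.Sequential([
    base,
    layers.Dense(1, activation="sigmoid"),  # P(COVID-19)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```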