
Title: Multi‐scale, domain knowledge‐guided attention + random forest: a two‐stage deep learning‐based multi‐scale guided attention models to diagnose idiopathic pulmonary fibrosis from computed tomography images
Abstract Background

Idiopathic pulmonary fibrosis (IPF) is a progressive, irreversible, and usually fatal lung disease of unknown cause, generally affecting the elderly population. Early diagnosis of IPF is crucial for triaging patients into anti‐fibrotic treatment or treatments for other causes of pulmonary fibrosis. However, the current IPF diagnosis workflow is complicated and time‐consuming: it requires collaborative effort from radiologists, pathologists, and clinicians, and it is subject to substantial inter‐observer variability.


The purpose of this work is to develop a deep learning‐based automated system that can diagnose subjects with IPF among subjects with interstitial lung disease (ILD) using an axial chest computed tomography (CT) scan. This work can potentially enable timely diagnosis decisions and reduce inter‐observer variability.


Our dataset contains CT scans from 349 IPF patients and 529 non‐IPF ILD patients. We used 80% of the dataset for training and validation and held out the remaining 20% as the test set. We proposed a two‐stage model. At stage one, we built a multi‐scale, domain knowledge‐guided attention model (MSGA) that encouraged the model to focus on specific areas of interest to enhance model explainability, including both high‐ and medium‐resolution attentions. At stage two, we collected the output from MSGA and constructed a random forest (RF) classifier for patient‐level diagnosis to further boost model accuracy. The RF classifier is used as the final decision stage because it is interpretable, computationally fast, and can handle correlated variables. Model utility was examined by (1) accuracy, represented by the area under the receiver operating characteristic curve (AUC) with standard deviation (SD), and (2) explainability, illustrated by visual examination of the estimated attention maps, which showed the areas important for model diagnostics.
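The accuracy metric described above, AUC reported with its SD, can be illustrated with a minimal Python sketch. It assumes the SD is taken across repeated validation folds, which the abstract does not specify. The function names `roc_auc` and `auc_mean_sd` and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_mean_sd(folds):
    """Mean and sample SD of AUC over (labels, scores) pairs, one per fold.
    Assumption: SD is computed across folds; requires at least two folds."""
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return mean(aucs), stdev(aucs)


# Toy, made-up predictions: 1 = IPF, 0 = non-IPF ILD.
folds = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    ([1, 0, 1, 0], [0.7, 0.6, 0.4, 0.2]),
]
auc_mean, auc_sd = auc_mean_sd(folds)
print(f"AUC±SD = {auc_mean:.2f}±{auc_sd:.2f}")
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for illustration; a rank-based computation would be preferable at the scale of the full dataset.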


During the training and validation stage, we observed that when no guidance from domain knowledge is provided, the IPF diagnosis model reaches acceptable performance (AUC±SD = 0.93±0.07) but lacks explainability; when only guided high‐ or medium‐resolution attention is included, the learned attention maps are not satisfactory; when both high‐ and medium‐resolution attention are included, under certain hyperparameter settings, the model reaches the highest AUC among all experiments (AUC±SD = 0.99±0.01) and the estimated attention maps concentrate on the regions of interest for this task. The three best‐performing hyperparameter selections according to MSGA were applied to the holdout test set and reached model performance comparable to that on the validation set.


Our results suggest that, for a task with only scan‐level labels available, MSGA+RF can utilize the population‐level domain knowledge to guide the training of the network, which increases both model accuracy and explainability.

Award ID(s):
2205441 2054253
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Medical Physics
Page Range / eLocation ID:
p. 894-905
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    Lung cancer is the deadliest and second most common cancer in the United States due to the lack of symptoms for early diagnosis. Pulmonary nodules are small abnormal regions that can be potentially correlated to the occurrence of lung cancer. Early detection of these nodules is critical because it can significantly improve patients' survival rates. Thoracic thin‐sliced computed tomography (CT) scanning has emerged as a widely used method for the diagnosis and prognosis of lung abnormalities.


    The standard clinical workflow of detecting pulmonary nodules relies on radiologists to analyze CT images to assess the risk factors of cancerous nodules. However, this approach can be error‐prone due to the various nodule formation causes, such as pollutants and infections. Deep learning (DL) algorithms have recently demonstrated remarkable success in medical image classification and segmentation. As DL algorithms become an ever more important assistant to radiologists in nodule detection, it is imperative to ensure that the DL algorithm and the radiologist can better understand each other's decisions. This study aims to develop a framework integrating explainable AI methods to achieve accurate pulmonary nodule detection.


    A robust and explainable detection (RXD) framework is proposed, focusing on reducing false positives in pulmonary nodule detection. Its implementation is based on an explanation supervision method, which uses radiologists' nodule contours as supervision signals to force the model to learn nodule morphologies, improving its ability to learn from small datasets. In addition, two imputation methods are applied to the nodule region annotations to reduce the noise within human annotations and allow the model to have robust attributions that meet human expectations. The 480, 265, and 265 CT image sets from the public Lung Image Database Consortium and Image Database Resource Initiative (LIDC‐IDRI) dataset are used for training, validation, and testing.


    Using only 10, 30, 50, and 100 training samples sequentially, our method consistently improves on the baseline's classification performance and explanation quality in terms of Area Under the Curve (AUC) and Intersection over Union (IoU). In particular, our framework with a learnable imputation kernel improves IoU over the baseline by 24.0% to 80.0%. A pre‐defined Gaussian imputation kernel achieves an even greater improvement, from 38.4% to 118.8% over the baseline. Compared to the baseline trained on 100 samples, our method shows less drop in AUC when trained on fewer samples. A comprehensive comparison of interpretability shows that our method aligns better with expert opinions.


    A pulmonary nodule detection framework was demonstrated using public thoracic CT image datasets. The framework integrates the robust explanation supervision (RES) technique to ensure the performance of nodule classification and morphology. The method can reduce the workload of radiologists and enable them to focus on the diagnosis and prognosis of the potential cancerous pulmonary nodules at the early stage to improve the outcomes for lung cancer patients.

  2. Abstract

    Cystic fibrosis (CF) is characterized by chronic respiratory infections which progressively decrease lung function over time. Affected individuals experience episodes of intensified respiratory symptoms called pulmonary exacerbations (PEx), which in turn accelerate pulmonary function decline and decrease survival rate. An overarching challenge is that there is no standard classification for PEx, which results in treatments that are heterogeneous. Improving PEx classification and management is a significant research priority for people with CF. Previous studies have shown that volatile organic compounds (VOCs) in exhaled breath can be used as biomarkers because they are products of metabolic pathways dysregulated by different diseases. To provide insights on PEx classification and other CF clinical factors, exhaled breath samples were collected from 18 subjects with CF, with some experiencing PEx and others serving as a baseline. Exhaled breath was collected in Tedlar bags during tidal breathing and cryotransferred to headspace vials for VOC analysis by solid phase microextraction coupled to gas chromatography–mass spectrometry. Statistical significance testing between quantitative and categorical clinical variables showed that percent-predicted forced expiratory volume in one second (FEV1pp) was decreased in subjects experiencing PEx. VOCs correlating with other clinical variables (body mass index, age, use of highly effective modulator treatment (HEMT), and the need for inhaled tobramycin) were also explored. Two volatile aldehydes (octanal and nonanal) were upregulated in patients not taking the HEMT. VOCs correlating to potential confounding variables were removed, and the remaining VOCs were then analyzed by regression for significant correlations with FEV1pp measurements. Interestingly, the VOC with the highest correlation with FEV1pp (3,7-dimethyldecane) also gave the lowest p-value when comparing subjects at baseline and during PEx. Other VOCs identified in this study as differentially expressed due to PEx include durene, 2,4,4-trimethyl-1,3-pentanediol 1-isobutyrate, and 5-methyltridecane. Receiver operator characteristic curves were developed and showed that 3,7-dimethyldecane had a higher ability to classify PEx (area under the curve (AUC) = 0.91) relative to FEV1pp values at collection (AUC = 0.83). However, normalized ΔFEV1pp values had the highest capability to distinguish PEx (AUC = 0.93). These results show that VOCs in exhaled breath may be a rich source of biomarkers for various clinical traits of CF, including PEx, that should be explored in larger sample cohorts and validation studies.

  3. Interstitial lung disease (ILD) causes pulmonary fibrosis. The correct classification of ILD plays a crucial role in the diagnosis and treatment process. In this research work, we propose a lung nodule recognition method based on a deep convolutional neural network (DCNN) and global features, which can be used for computer-aided diagnosis (CAD) of global features of lung nodules. First, a DCNN is constructed based on the characteristics and complexity of lung computed tomography (CT) images. Then we discuss the effects of different numbers of iterations on the recognition results and the influence of different model structures on the global features of lung nodules. We also incorporate improvements to convolution kernel size, feature dimension, and network depth. Third, the effects of different pooling methods, activation functions, and training algorithms are analyzed to demonstrate the advantages of the new strategy. Finally, the experimental results verify the feasibility of the proposed DCNN for CAD of global features of lung nodules, and the evaluation shows that the proposed method achieves outstanding results compared with state-of-the-art methods.
  4. Abstract

    The airway epithelium serves as the interface between the host and external environment. In many chronic lung diseases, the airway is the site of substantial remodeling after injury. While idiopathic pulmonary fibrosis (IPF) has traditionally been considered a disease of the alveolus and lung matrix, the dominant environmental (cigarette smoking) and genetic (gain-of-function MUC5B promoter variant) risk factors primarily affect the distal airway epithelium. Moreover, airway-specific pathogenic features of IPF include bronchiolization of the distal airspace with abnormal airway cell-types and honeycomb cystic terminal airway-like structures with concurrent loss of terminal bronchioles in regions of minimal fibrosis. However, the pathogenic role of the airway epithelium in IPF is unknown. Combining biophysical, genetic, and signaling analyses of primary airway epithelial cells, we demonstrate that healthy and IPF airway epithelia are biophysically distinct, identifying pathologic activation of the ERBB-YAP axis as a specific and modifiable driver of prolongation of the unjammed-to-jammed transition in IPF epithelia. Furthermore, we demonstrate that this biophysical state and signaling axis correlate with epithelial-driven activation of the underlying mesenchyme. Our data illustrate the active mechanisms regulating airway epithelial-driven fibrosis and identify targets to modulate disease progression.

  5. Abstract

    Breast carcinoma, the most common cancer among women worldwide, consists of a heterogeneous group of subtype diseases. Whole-slide images (WSIs) can capture cell-level heterogeneity and are routinely used for cancer diagnosis by pathologists. However, key driver genetic mutations related to targeted therapies are identified by genomic analysis such as high-throughput molecular profiling. In this study, we develop a deep-learning model to predict genetic mutations and biological pathway activities directly from WSIs. Our study offers unique insights into WSI visual interactions between a mutation and its related pathway, enabling a head-to-head comparison to reinforce our major findings. Using the histopathology images from the Genomic Data Commons Database, our model can predict the point mutations of six important genes (AUC 0.68–0.85) and copy number alteration of another six genes (AUC 0.69–0.79). Additionally, the trained models can predict the activities of three out of ten canonical pathways (AUC 0.65–0.79). Next, we visualized the weight maps of tumor tiles in WSI to understand the decision-making process of deep-learning models via a self-attention mechanism. We further validated our models on liver and lung cancers that are related to metastatic breast cancer. Our results provide insights into the association between pathological image features, molecular outcomes, and targeted therapies for breast cancer patients.
