skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Robust and accurate pulmonary nodule detection with self-supervised feature learning on domain adaptation
Medical imaging data annotation is expensive and time-consuming. Supervised deep learning approaches may encounter overfitting if trained with limited medical data, and further affect the robustness of computer-aided diagnosis (CAD) on CT scans collected by various scanner vendors. Additionally, the high false-positive rate in automatic lung nodule detection methods prevents their applications in daily clinical routine diagnosis. To tackle these issues, we first introduce a novel self-learning schema to train a pre-trained model by learning rich feature representatives from large-scale unlabeled data without extra annotation, which guarantees a consistent detection performance over novel datasets. Then, a 3D feature pyramid network ( 3DFPN ) is proposed for high-sensitivity nodule detection by extracting multi-scale features, where the weights of the backbone network are initialized by the pre-trained model and then fine-tuned in a supervised manner. Further, a High Sensitivity and Specificity ( HS 2 ) network is proposed to reduce false positives by tracking the appearance changes among continuous CT slices on Location History Images (LHI) for the detected nodule candidates. The proposed method’s performance and robustness are evaluated on several publicly available datasets, including LUNA16, SPIE-AAPM, LungTIME, and HMS. Our proposed detector achieves the state-of-the-art result of 90.6 % sensitivity at 1 / 8 false positive per scan on the LUNA16 dataset. The proposed framework’s generalizability has been evaluated on three additional datasets (i.e., SPIE-AAPM, LungTIME, and HMS) captured by different types of CT scanners.  more » « less
Award ID(s):
2041307
PAR ID:
10428808
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Frontiers in Radiology
Volume:
2
ISSN:
2673-8740
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Purpose This article introduces a novel deep learning approach to substantially improve the accuracy of colon segmentation even with limited data annotation, which enhances the overall effectiveness of the CT colonography pipeline in clinical settings. Methods The proposed approach integrates 3D contextual information via guided sequential episodic training in which a query CT slice is segmented by exploiting its previous labeled CT slice (i.e., support). Segmentation starts by detecting the rectum using a Markov Random Field-based algorithm. Then, supervised sequential episodic training is applied to the remaining slices, while contrastive learning is employed to enhance feature discriminability, thereby improving segmentation accuracy. Results The proposed method, evaluated on 98 abdominal scans of prepped patients, achieved a Dice coefficient of 97.3% and a polyp information preservation accuracy of 98.28%. Statistical analysis, including 95% confidence intervals, underscores the method’s robustness and reliability. Clinically, this high level of accuracy is vital for ensuring the preservation of critical polyp details, which are essential for accurate automatic diagnostic evaluation. The proposed method performs reliably in scenarios with limited annotated data. This is demonstrated by achieving a Dice coefficient of 97.15% when the model was trained on a smaller number of annotated CT scans (e.g., 10 scans) than the testing dataset (e.g., 88 scans). Conclusions The proposed sequential segmentation approach achieves promising results in colon segmentation. A key strength of the method is its ability to generalize effectively, even with limited annotated datasets—a common challenge in medical imaging. 
    more » « less
  2. Abstract BackgroundLung cancer is the deadliest and second most common cancer in the United States due to the lack of symptoms for early diagnosis. Pulmonary nodules are small abnormal regions that can be potentially correlated to the occurrence of lung cancer. Early detection of these nodules is critical because it can significantly improve the patient's survival rates. Thoracic thin‐sliced computed tomography (CT) scanning has emerged as a widely used method for diagnosing and prognosis lung abnormalities. PurposeThe standard clinical workflow of detecting pulmonary nodules relies on radiologists to analyze CT images to assess the risk factors of cancerous nodules. However, this approach can be error‐prone due to the various nodule formation causes, such as pollutants and infections. Deep learning (DL) algorithms have recently demonstrated remarkable success in medical image classification and segmentation. As an ever more important assistant to radiologists in nodule detection, it is imperative ensure the DL algorithm and radiologist to better understand the decisions from each other. This study aims to develop a framework integrating explainable AI methods to achieve accurate pulmonary nodule detection. MethodsA robust and explainable detection (RXD) framework is proposed, focusing on reducing false positives in pulmonary nodule detection. Its implementation is based on an explanation supervision method, which uses nodule contours of radiologists as supervision signals to force the model to learn nodule morphologies, enabling improved learning ability on small dataset, and enable small dataset learning ability. In addition, two imputation methods are applied to the nodule region annotations to reduce the noise within human annotations and allow the model to have robust attributions that meet human expectations. The 480, 265, and 265 CT image sets from the public Lung Image Database Consortium and Image Database Resource Initiative (LIDC‐IDRI) dataset are used for training, validation, and testing. ResultsUsing only 10, 30, 50, and 100 training samples sequentially, our method constantly improves the classification performance and explanation quality of baseline in terms of Area Under the Curve (AUC) and Intersection over Union (IoU). In particular, our framework with a learnable imputation kernel improves IoU from baseline by 24.0% to 80.0%. A pre‐defined Gaussian imputation kernel achieves an even greater improvement, from 38.4% to 118.8% from baseline. Compared to the baseline trained on 100 samples, our method shows less drop in AUC when trained on fewer samples. A comprehensive comparison of interpretability shows that our method aligns better with expert opinions. ConclusionsA pulmonary nodule detection framework was demonstrated using public thoracic CT image datasets. The framework integrates the robust explanation supervision (RES) technique to ensure the performance of nodule classification and morphology. The method can reduce the workload of radiologists and enable them to focus on the diagnosis and prognosis of the potential cancerous pulmonary nodules at the early stage to improve the outcomes for lung cancer patients. 
    more » « less
  3. Early intervention in kidney cancer helps to improve survival rates. Abdominal computed tomography (CT) is often used to diagnose renal masses. In clinical practice, the manual segmentation and quantification of organs and tumors are expensive and time-consuming. Artificial intelligence (AI) has shown a significant advantage in assisting cancer diagnosis. To reduce the workload of manual segmentation and avoid unnecessary biopsies or surgeries, in this paper, we propose a novel end-to-end AI-driven automatic kidney and renal mass diagnosis framework to identify the abnormal areas of the kidney and diagnose the histological subtypes of renal cell carcinoma (RCC). The proposed framework first segments the kidney and renal mass regions by a 3D deep learning architecture (Res-UNet), followed by a dual-path classification network utilizing local and global features for the subtype prediction of the most common RCCs: clear cell, chromophobe, oncocytoma, papillary, and other RCC subtypes. To improve the robustness of the proposed framework on the dataset collected from various institutions, a weakly supervised learning schema is proposed to leverage the domain gap between various vendors via very few CT slice annotations. Our proposed diagnosis system can accurately segment the kidney and renal mass regions and predict tumor subtypes, outperforming existing methods on the KiTs19 dataset. Furthermore, cross-dataset validation results demonstrate the robustness of datasets collected from different institutions trained via the weakly supervised learning schema. 
    more » « less
  4. Purpose: Limited studies exploring concrete methods or approaches to tackle and enhance model fairness in the radiology domain. Our proposed AI model utilizes supervised contrastive learning to minimize bias in CXR diagnosis. Materials and Methods: In this retrospective study, we evaluated our proposed method on two datasets: the Medical Imaging and Data Resource Center (MIDRC) dataset with 77,887 CXR images from 27,796 patients collected as of April 20, 2023 for COVID-19 diagnosis, and the NIH Chest X-ray (NIH-CXR) dataset with 112,120 CXR images from 30,805 patients collected between 1992 and 2015. In the NIH-CXR dataset, thoracic abnormalities include atelectasis, cardiomegaly, effusion, infiltration, mass, nodule, pneumonia, pneumothorax, consolidation, edema, emphysema, fibrosis, pleural thickening, or hernia. Our proposed method utilizes supervised contrastive learning with carefully selected positive and negative samples to generate fair image embeddings, which are fine-tuned for subsequent tasks to reduce bias in chest X-ray (CXR) diagnosis. We evaluated the methods using the marginal AUC difference (δ mAUC). Results: The proposed model showed a significant decrease in bias across all subgroups when compared to the baseline models, as evidenced by a paired T-test (p<0.0001). The δ mAUC obtained by our method were 0.0116 (95\% CI, 0.0110-0.0123), 0.2102 (95% CI, 0.2087-0.2118), and 0.1000 (95\% CI, 0.0988-0.1011) for sex, race, and age on MIDRC, and 0.0090 (95\% CI, 0.0082-0.0097) for sex and 0.0512 (95% CI, 0.0512-0.0532) for age on NIH-CXR, respectively. Conclusion: Employing supervised contrastive learning can mitigate bias in CXR diagnosis, addressing concerns of fairness and reliability in deep learning-based diagnostic methods. 
    more » « less
  5. Artificial intelligence-based prostate cancer (PCa) detection models have been widely explored to assist clinical diagnosis. However, these trained models may generate erroneous results specifically on datasets that are not within training distribution. In this paper, we propose an approach to tackle this so-called out-of-distribution (OOD) data problem. Specifically, we devise an end-to-end unsupervised framework to estimate uncertainty values for cases analyzed by a previously trained PCa detection model. Our PCa detection model takes the inputs of bpMRI scans and through our proposed approach we identify OOD cases that are likely to generate degraded performance due to the data distribution shifts. The proposed OOD framework consists of two parts. First, an autoencoder-based reconstruction network is proposed, which learns discrete latent representations of in-distribution data. Second, the uncertainty is computed using perceptual loss that measures the distance between original and reconstructed images in the feature space of a pre-trained PCa detection network. The effectiveness of the proposed framework is evaluated on seven independent data collections with a total of 1,432 cases. The performance of pre-trained PCa detection model is significantly improved by excluding cases with high uncertainty. 
    more » « less