skip to main content

Title: Organ localization and identification in thoracic CT volumes using 3D CNNs leveraging spatial anatomic relations
In this paper, we present a model to obtain prior knowledge for organ localization in CT thorax images using three dimensional convolutional neural networks (3D CNNs). Specifically, we use the knowledge obtained from CNNs in a Bayesian detector to establish the presence and location of a given target organ defined within a spherical coordinate system. We train a CNN to perform a soft detection of the target organ potentially present at any point, x = [r,Θ,Φ]T. This probability outcome is used as a prior in a Bayesian model whose posterior probability serves to provide a more accurate solution to the target organ detection problem. The likelihoods for the Bayesian model are obtained by performing a spatial analysis of the organs in annotated training volumes. Thoracic CT images from the NSCLC–Radiomics dataset are used in our case study, which demonstrates the enhancement in robustness and accuracy of organ identification. The average value of the detector accuracies for the right lung, left lung, and heart were found to be 94.87%, 95.37%, and 90.76% after the CNN stage, respectively. Introduction of spatial relationship using a Bayes classifier improved the detector accuracies to 95.14%, 96.20%, and 95.15%, respectively, showing a marked improvement in heart detection. This workflow improves the detection rate since the decision is made employing both lower level features (edges, contour etc) and complex higher level features (spatial relationship between organs). This strategy also presents a new application to CNNs and a novel methodology to introduce higher level context features like spatial relationship between objects present at a different location in images to real world object detection problems.  more » « less
Award ID(s):
1642380 1553436
Author(s) / Creator(s):
Date Published:
Journal Name:
Proc. SPIE Medical Imaging
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Non-small-cell lung cancer (NSCLC) represents approximately 80–85% of lung cancer diagnoses and is the leading cause of cancer-related death worldwide. Recent studies indicate that image-based radiomics features from positron emission tomography/computed tomography (PET/CT) images have predictive power for NSCLC outcomes. To this end, easily calculated functional features such as the maximum and the mean of standard uptake value (SUV) and total lesion glycolysis (TLG) are most commonly used for NSCLC prognostication, but their prognostic value remains controversial. Meanwhile, convolutional neural networks (CNN) are rapidly emerging as a new method for cancer image analysis, with significantly enhanced predictive power compared to hand-crafted radiomics features. Here we show that CNNs trained to perform the tumor segmentation task, with no other information than physician contours, identify a rich set of survival-related image features with remarkable prognostic value. In a retrospective study on pre-treatment PET-CT images of 96 NSCLC patients before stereotactic-body radiotherapy (SBRT), we found that the CNN segmentation algorithm (U-Net) trained for tumor segmentation in PET and CT images, contained features having strong correlation with 2- and 5-year overall and disease-specific survivals. The U-Net algorithm has not seen any other clinical information (e.g. survival, age, smoking history, etc.) than the images and the corresponding tumor contours provided by physicians. In addition, we observed the same trend by validating the U-Net features against an extramural data set provided by Stanford Cancer Institute. Furthermore, through visualization of the U-Net, we also found convincing evidence that the regions of metastasis and recurrence appear to match with the regions where the U-Net features identified patterns that predicted higher likelihoods of death. We anticipate our findings will be a starting point for more sophisticated non-intrusive patient specific cancer prognosis determination. For example, the deep learned PET/CT features can not only predict survival but also visualize high-risk regions within or adjacent to the primary tumor and hence potentially impact therapeutic outcomes by optimal selection of therapeutic strategy or first-line therapy adjustment. 
    more » « less
  2. Medical imaging data annotation is expensive and time-consuming. Supervised deep learning approaches may encounter overfitting if trained with limited medical data, and further affect the robustness of computer-aided diagnosis (CAD) on CT scans collected by various scanner vendors. Additionally, the high false-positive rate in automatic lung nodule detection methods prevents their applications in daily clinical routine diagnosis. To tackle these issues, we first introduce a novel self-learning schema to train a pre-trained model by learning rich feature representatives from large-scale unlabeled data without extra annotation, which guarantees a consistent detection performance over novel datasets. Then, a 3D feature pyramid network ( 3DFPN ) is proposed for high-sensitivity nodule detection by extracting multi-scale features, where the weights of the backbone network are initialized by the pre-trained model and then fine-tuned in a supervised manner. Further, a High Sensitivity and Specificity ( HS 2 ) network is proposed to reduce false positives by tracking the appearance changes among continuous CT slices on Location History Images (LHI) for the detected nodule candidates. The proposed method’s performance and robustness are evaluated on several publicly available datasets, including LUNA16, SPIE-AAPM, LungTIME, and HMS. Our proposed detector achieves the state-of-the-art result of 90.6 % sensitivity at 1 / 8 false positive per scan on the LUNA16 dataset. The proposed framework’s generalizability has been evaluated on three additional datasets (i.e., SPIE-AAPM, LungTIME, and HMS) captured by different types of CT scanners. 
    more » « less
  3. Cloud detection is an inextricable pre-processing step in remote sensing image analysis workflows. Most of the traditional rule-based and machine-learning-based algorithms utilize low-level features of the clouds and classify individual cloud pixels based on their spectral signatures. Cloud detection using such approaches can be challenging due to a multitude of factors including harsh lighting conditions, the presence of thin clouds, the context of surrounding pixels, and complex spatial patterns. In recent studies, deep convolutional neural networks (CNNs) have shown outstanding results in the computer vision domain. These methods are practiced for better capturing the texture, shape as well as context of images. In this study, we propose a deep learning CNN approach to detect cloud pixels from medium-resolution satellite imagery. The proposed CNN accounts for both the low-level features, such as color and texture information as well as high-level features extracted from successive convolutions of the input image. We prepared a cloud-pixel dataset of approximately 7273 randomly sampled 320 by 320 pixels image patches taken from a total of 121 Landsat-8 (30m) and Sentinel-2 (20m) image scenes. These satellite images come with cloud masks. From the available data channels, only blue, green, red, and NIR bands are fed into the model. The CNN model was trained on 5300 image patches and validated on 1973 independent image patches. As the final output from our model, we extract a binary mask of cloud pixels and non-cloud pixels. The results are benchmarked against established cloud detection methods using standard accuracy metrics.

    more » « less
  4. Flooding is one of the leading threats of natural disasters to human life and property, especially in densely populated urban areas. Rapid and precise extraction of the flooded areas is key to supporting emergency-response planning and providing damage assessment in both spatial and temporal measurements. Unmanned Aerial Vehicles (UAV) technology has recently been recognized as an efficient photogrammetry data acquisition platform to quickly deliver high-resolution imagery because of its cost-effectiveness, ability to fly at lower altitudes, and ability to enter a hazardous area. Different image classification methods including SVM (Support Vector Machine) have been used for flood extent mapping. In recent years, there has been a significant improvement in remote sensing image classification using Convolutional Neural Networks (CNNs). CNNs have demonstrated excellent performance on various tasks including image classification, feature extraction, and segmentation. CNNs can learn features automatically from large datasets through the organization of multi-layers of neurons and have the ability to implement nonlinear decision functions. This study investigates the potential of CNN approaches to extract flooded areas from UAV imagery. A VGG-based fully convolutional network (FCN-16s) was used in this research. The model was fine-tuned and a k-fold cross-validation was applied to estimate the performance of the model on the new UAV imagery dataset. This approach allowed FCN-16s to be trained on the datasets that contained only one hundred training samples, and resulted in a highly accurate classification. Confusion matrix was calculated to estimate the accuracy of the proposed method. The image segmentation results obtained from FCN-16s were compared from the results obtained from FCN-8s, FCN-32s and SVMs. Experimental results showed that the FCNs could extract flooded areas precisely from UAV images compared to the traditional classifiers such as SVMs. The classification accuracy achieved by FCN-16s, FCN-8s, FCN-32s, and SVM for the water class was 97.52%, 97.8%, 94.20% and 89%, respectively. 
    more » « less
  5. Convolutional neural networks (CNNs) are becoming an increasingly popular approach for classification mapping of large complex regions where manual data collection is too time consuming. Stream boundaries in hyper-arid polar regions such as the McMurdo Dry Valleys (MDVs) in Antarctica are difficult to locate because they have little hydraulic flow throughout the short summer months. This paper utilizes a U-Net CNN to map stream boundaries from lidar derived rasters in Taylor Valley located within the MDVs, covering ∼770 km2. The training dataset consists of 217 (300 × 300 m2) well-distributed tiles of manually classified stream boundaries with diverse geometries (straight, sinuous, meandering, and braided) throughout the valley. The U-Net CNN is trained on elevation, slope, lidar intensity returns, and flow accumulation rasters. These features were used for detection of stream boundaries by providing potential topographic cues such as inflection points at stream boundaries and reflective properties of streams such as linear patterns of wetted soil, water, or ice. Various combinations of these features were analyzed based on performance. The test set performance revealed that elevation and slope had the highest performance of the feature combinations. The test set performance analysis revealed that the CNN model trained with elevation independently received a precision, recall, and F1 score of 0.94±0.05, 0.95±0.04, and 0.94±0.04 respectively, while slope received 0.96±0.03, 0.93±0.04, and 0.94±0.04, respectively. The performance of the test set revealed higher stream boundary prediction accuracies along the coast, while inland performance varied. Meandering streams had the highest stream boundary prediction performance on the test set compared to the other stream geometries tested here because meandering streams are further evolved and have more distinguishable breaks in slope, indicating stream boundaries. These methods provide a novel approach for mapping stream boundaries semi-automatically in complex regions such as hyper-arid environments over larger scales than is possible for current methods. 
    more » « less