skip to main content

Title: A Multi-task Contextual Atrous Residual Network for Brain Tumor Detection & Segmentation
In recent years, deep neural networks have achieved state-of-the-art performance in a variety of recognition and segmentation tasks in medical imaging including brain tumor segmentation. We investigate that segmenting a brain tumor is facing to the imbalanced data problem where the number of pixels belonging to the background class (non tumor pixel) is much larger than the number of pixels belonging to the foreground class (tumor pixel). To address this problem, we propose a multitask network which is formed as a cascaded structure. Our model consists of two targets, i.e., (i) effectively differentiate the brain tumor regions and (ii) estimate the brain tumor mask. The first objective is performed by our proposed contextual brain tumor detection network, which plays a role of an attention gate and focuses on the region around brain tumor only while ignoring the far neighbor background which is less correlated to the tumor. Different from other existing object detection networks which process every pixel, our contextual brain tumor detection network only processes contextual regions around ground-truth instances and this strategy aims at producing meaningful regions proposals. The second objective is built upon a 3D atrous residual network and under an encode-decode network in order to effectively segment both large and small objects (brain tumor). Our 3D atrous residual network is designed with a skip connection to enables the gradient more » from the deep layers to be directly propagated to shallow layers, thus, features of different depths are preserved and used for refining each other. In order to incorporate larger contextual information from volume MRI data, our network utilizes the 3D atrous convolution with various kernel sizes, which enlarges the receptive field of filters. Our proposed network has been evaluated on various datasets including BRATS2015, BRATS2017 and BRATS2018 datasets with both validation set and testing set. Our performance has been benchmarked by both regionbased metrics and surface-based metrics. We also have conducted comparisons against state-of-the-art approaches « less
; ; ; ;
Award ID(s):
Publication Date:
Journal Name:
2020 25th International Conference on Pattern Recognition (ICPR)
Sponsoring Org:
National Science Foundation
More Like this
  1. Medical image segmentation has played an important role in medical analysis and widely developed for many clinical applications. Deep learning-based approaches have achieved high performance in semantic segmentation but they are limited to pixel-wise setting and imbalanced classes data problem. In this paper, we tackle those limitations by developing a new deep learning-based model which takes into account both higher feature level i.e. region inside contour, intermediate feature level i.e. offset curves around the contour and lower feature level i.e. contour. Our proposed Offset Curves (OsC) loss consists of three main fitting terms. The first fitting term focuses on pixel-wise level segmentation whereas the second fitting term acts as attention model which pays attention to the area around the boundaries (offset curves). The third terms plays a role as regularization term which takes the length of boundaries into account. We evaluate our proposed OsC loss on both 2D network and 3D network. Two common medical datasets, i.e. retina DRIVE and brain tumor BRATS 2018 datasets are used to benchmark our proposed loss performance. The experiments have shown that our proposed OsC loss function outperforms other mainstream loss functions such as Cross-Entropy, Dice, Focal on the most common segmentation networks Unet,more »FCN.« less
  2. Medical image segmentation is one of the most challenging tasks in medical image analysis and widely developed for many clinical applications. While deep learning-based approaches have achieved impressive performance in semantic segmentation, they are limited to pixel-wise settings with imbalanced-class data problems and weak boundary object segmentation in medical images. In this paper, we tackle those limitations by developing a new two-branch deep network architecture which takes both higher level features and lower level features into account. The first branch extracts higher level feature as region information by a common encoder-decoder network structure such as Unet and FCN, whereas the second branch focuses on lower level features as support information around the boundary and processes in parallel to the first branch. Our key contribution is the second branch named Narrow Band Active Contour (NB-AC) attention model which treats the object contour as a hyperplane and all data inside a narrow band as support information that influences the position and orientation of the hyperplane. Our proposed NB-AC attention model incorporates the contour length with the region energy involving a fixed-width band around the curve or surface. The proposed network loss contains two fitting terms: (i) a high level feature (i.e., region)more »fitting term from the first branch; (ii) a lower level feature (i.e., contour) fitting term from the second branch including the (ii1) length of the object contour and (ii2) regional energy functional formed by the homogeneity criterion of both the inner band and outer band neighboring the evolving curve or surface. The proposed NB-AC loss can be incorporated into both 2D and 3D deep network architectures. The proposed network has been evaluated on different challenging medical image datasets, including DRIVE, iSeg17, MRBrainS18 and Brats18. The experimental results have shown that the proposed NB-AC loss outperforms other mainstream loss functions: Cross Entropy, Dice, Focal on two common segmentation frameworks Unet and FCN. Our 3D network which is built upon the proposed NB-AC loss and 3DUnet framework achieved state-of-the-art results on multiple volumetric datasets.« less
  3. We propose UniPose, a unified framework for human pose estimation, based on our “Waterfall” Atrous Spatial Pooling architecture, that achieves state-of-art-results on several pose estimation metrics. Current pose estimation methods utilizing standard CNN architectures heavily rely on statistical postprocessing or predefined anchor poses for joint localization. UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage, with high accuracy, without relying on statistical postprocessing methods. The Waterfall module in UniPose leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multiscale fields-of-view comparable to spatial pyramid configurations. Additionally, our method is extended to UniPoseLSTM for multi-frame processing and achieves state-of-theart results for temporal pose estimation in Video. Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation obtaining state-ofthe-art results in single person pose detection for both single images and videos
  4. Introduction: Vaso-occlusive crises (VOCs) are a leading cause of morbidity and early mortality in individuals with sickle cell disease (SCD). These crises are triggered by sickle red blood cell (sRBC) aggregation in blood vessels and are influenced by factors such as enhanced sRBC and white blood cell (WBC) adhesion to inflamed endothelium. Advances in microfluidic biomarker assays (i.e., SCD Biochip systems) have led to clinical studies of blood cell adhesion onto endothelial proteins, including, fibronectin, laminin, P-selectin, ICAM-1, functionalized in microchannels. These microfluidic assays allow mimicking the physiological aspects of human microvasculature and help characterize biomechanical properties of adhered sRBCs under flow. However, analysis of the microfluidic biomarker assay data has so far relied on manual cell counting and exhaustive visual morphological characterization of cells by trained personnel. Integrating deep learning algorithms with microscopic imaging of adhesion protein functionalized microfluidic channels can accelerate and standardize accurate classification of blood cells in microfluidic biomarker assays. Here we present a deep learning approach into a general-purpose analytical tool covering a wide range of conditions: channels functionalized with different proteins (laminin or P-selectin), with varying degrees of adhesion by both sRBCs and WBCs, and in both normoxic and hypoxic environments. Methods: Our neuralmore »networks were trained on a repository of manually labeled SCD Biochip microfluidic biomarker assay whole channel images. Each channel contained adhered cells pertaining to clinical whole blood under constant shear stress of 0.1 Pa, mimicking physiological levels in post-capillary venules. The machine learning (ML) framework consists of two phases: Phase I segments pixels belonging to blood cells adhered to the microfluidic channel surface, while Phase II associates pixel clusters with specific cell types (sRBCs or WBCs). Phase I is implemented through an ensemble of seven generative fully convolutional neural networks, and Phase II is an ensemble of five neural networks based on a Resnet50 backbone. Each pixel cluster is given a probability of belonging to one of three classes: adhered sRBC, adhered WBC, or non-adhered / other. Results and Discussion: We applied our trained ML framework to 107 novel whole channel images not used during training and compared the results against counts from human experts. As seen in Fig. 1A, there was excellent agreement in counts across all protein and cell types investigated: sRBCs adhered to laminin, sRBCs adhered to P-selectin, and WBCs adhered to P-selectin. Not only was the approach able to handle surfaces functionalized with different proteins, but it also performed well for high cell density images (up to 5000 cells per image) in both normoxic and hypoxic conditions (Fig. 1B). The average uncertainty for the ML counts, obtained from accuracy metrics on the test dataset, was 3%. This uncertainty is a significant improvement on the 20% average uncertainty of the human counts, estimated from the variance in repeated manual analyses of the images. Moreover, manual classification of each image may take up to 2 hours, versus about 6 minutes per image for the ML analysis. Thus, ML provides greater consistency in the classification at a fraction of the processing time. To assess which features the network used to distinguish adhered cells, we generated class activation maps (Fig. 1C-E). These heat maps indicate the regions of focus for the algorithm in making each classification decision. Intriguingly, the highlighted features were similar to those used by human experts: the dimple in partially sickled RBCs, the sharp endpoints for highly sickled RBCs, and the uniform curvature of the WBCs. Overall the robust performance of the ML approach in our study sets the stage for generalizing it to other endothelial proteins and experimental conditions, a first step toward a universal microfluidic ML framework targeting blood disorders. Such a framework would not only be able to integrate advanced biophysical characterization into fast, point-of-care diagnostic devices, but also provide a standardized and reliable way of monitoring patients undergoing targeted therapies and curative interventions, including, stem cell and gene-based therapies for SCD. Disclosures Gurkan: Dx Now Inc.: Patents & Royalties; Xatek Inc.: Patents & Royalties; BioChip Labs: Patents & Royalties; Hemex Health, Inc.: Consultancy, Current Employment, Patents & Royalties, Research Funding.« less
  5. Detecting objects in aerial images is challenging for at least two reasons: (1) target objects like pedestrians are very small in pixels, making them hardly distinguished from surrounding background; and (2) targets are in general sparsely and non-uniformly distributed, making the detection very inefficient. In this paper, we address both issues inspired by observing that these targets are often clustered. In particular, we propose a Clustered Detection (ClusDet) network that unifies object clustering and detection in an end-to-end framework. The key components in ClusDet include a cluster proposal sub-network (CPNet), a scale estimation sub-network (ScaleNet), and a dedicated detection network (DetecNet). Given an input image, CPNet produces object cluster regions and ScaleNet estimates object scales for these regions. Then, each scale-normalized cluster region and their features are fed into DetecNet for object detection. ClusDet has several advantages over previous solutions: (1) it greatly reduces the number of chips for final object detection and hence achieves high running time efficiency, (2) the cluster-based scale estimation is more accurate than previously used single-object based ones, hence effectively improves the detection for small objects, and (3) the final DetecNet is dedicated for clustered regions and implicitly models the prior context information so asmore »to boost detection accuracy. The proposed method is tested on three popular aerial image datasets including VisDrone, UAVDT and DOTA. In all experiments, ClusDet achieves promising performance in comparison with state-of-the-art detectors« less