Deep learning algorithms have been successfully adopted to extract meaningful information from digital images, yet many remain untapped for semantic segmentation of histopathology images. In this paper, we propose a deep convolutional neural network model that strengthens atrous separable convolutions with high dilation rates within the spatial pyramid pooling module for histopathology image segmentation. The well-known DeepLabV3Plus architecture was used for the encoder-decoder design, with ResNet50 adopted as the encoder backbone; its skip connections attenuate the optimization problems introduced by increased network depth. Three atrous separable convolutions with higher dilation rates were added to the existing atrous separable convolutions. We evaluated the proposed model on three tissue types (tumor, tumor-infiltrating lymphocytes, and stroma), comparing it with eight state-of-the-art deep learning models: DeepLabV3, DeepLabV3Plus, LinkNet, MANet, PAN, PSPnet, UNet, and UNet++. The results show that the proposed model outperforms all eight models in mIoU (0.8058/0.7792) and FSCR (0.8525/0.8328) for tumor and tumor-infiltrating lymphocytes, respectively.
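The abstract names the design choices but not their implementation, so the following is a minimal PyTorch sketch, assuming a DeepLabV3Plus-style ASPP module extended with three extra high-rate atrous separable convolution branches. The specific dilation rates (24, 30, 36), the 2048-channel ResNet50 input, and the omission of the image-pooling branch are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class AtrousSeparableConv(nn.Module):
    """Depthwise dilated 3x3 convolution followed by a 1x1 pointwise convolution."""

    def __init__(self, in_ch, out_ch, rate):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=rate,
                                   dilation=rate, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))


class ASPPWithExtraRates(nn.Module):
    """ASPP-style module: a 1x1 branch plus atrous separable branches,
    including additional high-rate branches (rates are assumed values)."""

    def __init__(self, in_ch=2048, out_ch=256,
                 base_rates=(6, 12, 18), extra_rates=(24, 30, 36)):
        super().__init__()
        branches = [nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, bias=False),
                                  nn.BatchNorm2d(out_ch),
                                  nn.ReLU(inplace=True))]
        branches += [AtrousSeparableConv(in_ch, out_ch, r)
                     for r in (*base_rates, *extra_rates)]
        self.branches = nn.ModuleList(branches)
        # Fuse all branch outputs back to out_ch channels.
        self.project = nn.Sequential(
            nn.Conv2d(out_ch * len(branches), out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True))

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))


# Example: features from a ResNet50 encoder stage (1 x 2048 x 32 x 32).
feats = torch.randn(1, 2048, 32, 32)
out = ASPPWithExtraRates()(feats)
print(out.shape)  # torch.Size([1, 256, 32, 32])
```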
SN-FPN: Self-Attention Nested Feature Pyramid Network for Digital Pathology Image Segmentation
Digital pathology has played a key role in replacing glass slides with digital images, enhancing various pathology workflows. Whole slide images are digitized pathological images that extend the capabilities of digital pathology and reduce the overall turnaround time for diagnoses. These digitized images have been successfully integrated with artificial intelligence algorithms that assist pathologists in many tasks, yet there is still a demand for new algorithms that improve the diagnostic process. In this paper, we propose a new deep convolutional neural network model that integrates a feature pyramid network with a self-attention mechanism across three pathways: an encoder, a decoder, and a self-attention nested pathway, to provide accurate tumor region segmentation on whole slide images. The encoder pathway adopts the ResNet50 architecture as the bottom-up network. The decoder pathway adopts the feature pyramid network as the top-down network. The self-attention nested pathway forms an attention map, represented by the distribution of attention scores, that focuses on localizing tumor regions while suppressing irrelevant information. Our experimental results show that the proposed model outperforms state-of-the-art deep convolutional neural network models in tumor and stromal region segmentation. Moreover, various encoder networks were paired with the proposed model and compared with each other; the results indicate that the ResNet series used with the proposed model outperforms the other encoder networks.
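To make the self-attention nested pathway more concrete, here is a minimal PyTorch sketch of a non-local-style spatial self-attention block of the kind that could refine one FPN level; the reduction ratio, learned residual weighting, and placement are assumptions for illustration, not the published SN-FPN design.

```python
import torch
import torch.nn as nn


class SpatialSelfAttention(nn.Module):
    """Self-attention over the spatial positions of a feature map.
    The softmax scores play the role of the attention map described above."""

    def __init__(self, channels, reduced=None):
        super().__init__()
        reduced = reduced or max(channels // 8, 1)
        self.query = nn.Conv2d(channels, reduced, 1)
        self.key = nn.Conv2d(channels, reduced, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned weight for the attention path

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # [B, HW, C']
        k = self.key(x).flatten(2)                      # [B, C', HW]
        attn = torch.softmax(q @ k, dim=-1)             # [B, HW, HW] attention scores
        v = self.value(x).flatten(2).transpose(1, 2)    # [B, HW, C]
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return self.gamma * out + x                     # attention-refined features


# Example: refine one 256-channel FPN level before top-down merging.
p4 = torch.randn(1, 256, 32, 32)
refined = SpatialSelfAttention(256)(p4)
print(refined.shape)  # torch.Size([1, 256, 32, 32])
```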
- Award ID(s): 2409705
- PAR ID: 10618593
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE Access
- Volume: 12
- ISSN: 2169-3536
- Page Range / eLocation ID: 92764 to 92773
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In this paper, we are the first to propose a new graph convolution-based decoder, namely the Cascaded Graph Convolutional Attention Decoder (G-CASCADE), for 2D medical image segmentation. G-CASCADE progressively refines multi-stage feature maps generated by hierarchical transformer encoders with an efficient graph convolution block. The encoder utilizes the self-attention mechanism to capture long-range dependencies, while the decoder refines the feature maps, preserving long-range information due to the global receptive fields of the graph convolution block. Rigorous evaluations of our decoder with multiple transformer encoders on five medical image segmentation tasks (i.e., abdomen organs, cardiac organs, polyp lesions, skin lesions, and retinal vessels) show that our model outperforms other state-of-the-art (SOTA) methods. We also demonstrate that our decoder achieves better DICE scores than the SOTA CASCADE decoder with 80.8% fewer parameters and 82.3% fewer FLOPs. Our decoder can easily be used with other hierarchical encoders for general-purpose semantic and medical image segmentation tasks.
- The segment anything model (SAM) was released as a foundation model for image segmentation. The promptable segmentation model was trained with over 1 billion masks on 11M licensed and privacy-respecting images. The model supports zero-shot image segmentation with various segmentation prompts (e.g., points, boxes, masks), which makes SAM attractive for medical image analysis, especially for digital pathology, where training data are scarce. In this study, we evaluate the zero-shot segmentation performance of the SAM model on representative segmentation tasks on whole slide imaging (WSI), including (1) tumor segmentation, (2) non-tumor tissue segmentation, and (3) cell nuclei segmentation. Core results: the zero-shot SAM model achieves remarkable segmentation performance for large connected objects; however, it does not consistently achieve satisfying performance for dense instance object segmentation, even with 20 prompts (clicks/boxes) on each image. We also summarize the identified limitations for digital pathology: (1) image resolution, (2) multiple scales, (3) prompt selection, and (4) model fine-tuning. In the future, few-shot fine-tuning with images from downstream pathological segmentation tasks might help the model achieve better performance in dense object segmentation. (A minimal zero-shot prompting sketch follows this list.)
- As one of the popular deep learning methods, deep convolutional neural networks (DCNNs) have been widely adopted in segmentation tasks and have received positive feedback. However, in segmentation tasks, DCNN-based frameworks are known for their difficulty in modeling global relations within imaging features. Although several techniques have been proposed to enhance the global reasoning of DCNNs, these models either fail to achieve satisfying performance compared with traditional fully convolutional structures or cannot exploit the basic advantage of CNN-based networks (namely, local reasoning). In this study, in contrast with current attempts to combine FCNs and global reasoning methods, we fully exploit the capability of self-attention by designing a novel attention mechanism for 3D computation and propose a new segmentation framework (named 3DTU) for three-dimensional medical image segmentation tasks. This framework processes images in an end-to-end manner and executes 3D computation on both the encoder side (which contains a 3D transformer) and the decoder side (which is based on a 3D DCNN). We tested our framework on two independent datasets consisting of 3D MRI and CT images. Experimental results clearly demonstrate that our method outperforms several state-of-the-art segmentation methods in various metrics.
- Convolutional neural network (CNN) based image segmentation has made great progress in recent years. However, video object segmentation remains a challenging task due to its high computational complexity. Most previous methods employ a two-stream CNN framework to handle spatial and motion features separately. In this paper, we propose an end-to-end encoder-decoder style 3D CNN that aggregates spatial and temporal information simultaneously for video object segmentation. To process video efficiently, we propose a 3D separable convolution for the pyramid pooling module and decoder, which dramatically reduces the number of operations while maintaining performance. Moreover, we extend our framework to video action segmentation by adding an extra classifier that predicts the action label for actors in videos. Extensive experiments on several video datasets demonstrate the superior performance of the proposed approach for action and object segmentation compared to the state of the art. (A sketch of a 3D separable convolution follows this list.)
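Relating to the SAM entry above, the following is a minimal zero-shot prompting sketch using the public segment_anything package; the checkpoint path, WSI patch path, and prompt coordinates are placeholders rather than values from the cited study.

```python
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM backbone (ViT-B here) and wrap it in a predictor.
# "sam_vit_b.pth" is a placeholder path to a downloaded checkpoint.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

# A WSI is typically tiled; SAM is prompted on one RGB patch at a time.
patch = np.array(Image.open("wsi_patch.png").convert("RGB"))
predictor.set_image(patch)

# Point prompt: one foreground click (label 1) on a suspected tumor region.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[512, 384]]),
    point_labels=np.array([1]),
    multimask_output=True,          # SAM returns several candidate masks
)
best_mask = masks[np.argmax(scores)]

# Box prompt: a rough bounding box around a larger connected region.
box_masks, box_scores, _ = predictor.predict(
    box=np.array([100, 100, 900, 700]),
    multimask_output=False,
)
```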
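Relating to the video segmentation entry above, here is a minimal sketch of one common 3D separable convolution (a depthwise 3D convolution followed by a pointwise convolution), with a parameter-count comparison against a full 3D convolution; the cited paper's exact factorization is not specified here, so this variant is an assumption for illustration.

```python
import torch
import torch.nn as nn


class SeparableConv3d(nn.Module):
    """Depthwise-separable 3D convolution: per-channel 3x3x3 kernels, then 1x1x1 mixing."""

    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        # Depthwise: one 3D kernel per input channel (groups=in_ch).
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch, bias=False)
        # Pointwise: 1x1x1 convolution mixes channels.
        self.pointwise = nn.Conv3d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def n_params(m):
    return sum(p.numel() for p in m.parameters())


full = nn.Conv3d(256, 256, 3, padding=1, bias=False)
sep = SeparableConv3d(256, 256)
print(n_params(full))  # 1769472 parameters (256 * 256 * 27)
print(n_params(sep))   # 72448 parameters (256 * 27 + 256 * 256)

# Example: a clip of 8 frames at 64x64 resolution with 256 channels.
clip = torch.randn(1, 256, 8, 64, 64)
print(sep(clip).shape)  # torch.Size([1, 256, 8, 64, 64])
```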