
Title: 3D bi-directional transformer U-Net for medical image segmentation
As one of the most popular deep learning methods, deep convolutional neural networks (DCNNs) have been widely adopted in segmentation tasks with positive results. However, DCNN-based frameworks are known to struggle with modeling global relations within imaging features. Although several techniques have been proposed to enhance the global reasoning of DCNNs, these models either fail to match the performance of traditional fully convolutional structures or cannot exploit the basic advantage of CNN-based networks, namely local reasoning. In this study, in contrast to current attempts to combine FCNs with global reasoning methods, we fully exploit self-attention by designing a novel attention mechanism for 3D computation, and we propose a new segmentation framework (named 3DTU) for three-dimensional medical image segmentation tasks. This framework processes images end-to-end and performs 3D computation on both the encoder side (which contains a 3D transformer) and the decoder side (which is based on a 3D DCNN). We tested our framework on two independent datasets consisting of 3D MRI and CT images. Experimental results demonstrate that our method outperforms several state-of-the-art segmentation methods on various metrics.
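The global reasoning the abstract attributes to self-attention can be illustrated with a minimal sketch: flatten the voxels of a 3D feature volume into tokens and apply scaled dot-product attention, so every voxel can attend to every other voxel. This is a generic NumPy illustration of the mechanism, not the authors' 3DTU implementation; all shapes, names, and weight matrices here are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_3d(volume_feats, Wq, Wk, Wv):
    """volume_feats: (D, H, W, C) feature volume.
    Flattens the D*H*W voxels into tokens and applies scaled
    dot-product attention, so each voxel aggregates information
    from the entire volume -- the global reasoning plain CNNs lack."""
    D, H, W, C = volume_feats.shape
    tokens = volume_feats.reshape(D * H * W, C)          # (N, C)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))    # (N, N)
    out = weights @ v                                    # (N, C)
    return out.reshape(D, H, W, -1)

rng = np.random.default_rng(0)
feats = rng.normal(size=(2, 3, 3, 8))                    # toy 2x3x3 volume, 8 channels
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = attention_3d(feats, Wq, Wk, Wv)
print(out.shape)  # (2, 3, 3, 8)
```

Note the (N, N) attention matrix: its quadratic cost in the number of voxels is exactly why efficient 3D attention designs like the one proposed here are needed.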
Award ID(s):
2045848
PAR ID:
10390237
Author(s) / Creator(s):
Date Published:
Journal Name:
Frontiers in Big Data
Volume:
5
ISSN:
2624-909X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Collecting large-scale medical datasets with fully annotated samples for training deep networks is prohibitively expensive, especially for 3D volume data. Recent breakthroughs in self-supervised learning (SSL) offer a way to overcome the lack of labeled training samples by learning feature representations from unlabeled data. However, most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes. In practice, this restricts the ability to fully leverage unlabeled data from numerous sources, which may include both 2D and 3D data. Additionally, the use of these pre-trained networks is constrained to downstream tasks with compatible data dimensions. In this paper, we propose a novel framework for unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D images or 2D slices extracted from 3D volumes, we construct an SSL task based on a 2D contrastive clustering problem for distinct classes. The 3D volumes are exploited by computing a vectored embedding at each slice and then assembling a holistic feature through deformable self-attention mechanisms in a Transformer, allowing long-range dependencies between slices inside 3D volumes to be incorporated. These holistic features are further used to define a novel 3D clustering-agreement-based SSL task and a masked embedding prediction task inspired by pre-trained language models. Experiments on downstream tasks, such as 3D brain segmentation, lung nodule detection, 3D heart structure segmentation, and abnormal chest X-ray detection, demonstrate the effectiveness of our joint 2D and 3D SSL approach. We improve plain 2D Deep-ClusterV2 and SwAV by a significant margin and also surpass various modern 2D and 3D SSL approaches.
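The step of assembling per-slice embeddings into one holistic volume feature can be sketched with simple attention pooling over slices. The paper uses deformable self-attention in a Transformer; this simplified weighted sum only stands in for that idea, and every name and shape below is an assumption.

```python
import numpy as np

def pool_slices(slice_embs, w):
    """slice_embs: (S, C) -- one embedding per 2D slice of a volume.
    w: (C,) scoring vector (a stand-in for learned attention weights).
    Returns a single (C,) holistic volume feature as a softmax-weighted
    sum over slices, so every slice can contribute regardless of its
    distance from the others."""
    scores = slice_embs @ w                 # (S,) relevance per slice
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                    # softmax over slices
    return alpha @ slice_embs               # (C,) holistic feature

rng = np.random.default_rng(1)
embs = rng.normal(size=(6, 4))              # 6 slices, 4-dim embeddings
feat = pool_slices(embs, rng.normal(size=4))
print(feat.shape)  # (4,)
```

Because the weights sum to one, the holistic feature is a convex combination of the slice embeddings, which makes the downstream 3D clustering-agreement task comparable across volumes with different slice counts.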
  2. Interstitial lung disease (ILD) causes pulmonary fibrosis, and the correct classification of ILD plays a crucial role in diagnosis and treatment. In this work, we propose a lung nodule recognition method based on a deep convolutional neural network (DCNN) and global features, which can be used for computer-aided diagnosis (CAD) of global features of lung nodules. First, a DCNN is constructed based on the characteristics and complexity of lung computerized tomography (CT) images. Second, we discuss the effect of different numbers of training iterations on the recognition results and the influence of different model structures on the global features of lung nodules, incorporating improvements to the convolution kernel size, feature dimension, and network depth. Third, we analyze the effects of the proposed pooling methods, activation functions, and training algorithms to demonstrate the advantages of the new strategy. Finally, the experimental results verify the feasibility of the proposed DCNN for CAD of global features of lung nodules, and the evaluation shows that our method achieves outstanding results compared to the state of the art.
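The pooling choices compared above can be made concrete with a minimal NumPy illustration of max versus average pooling over 2x2 windows with stride 2. The kernel sizes and strides in the study are its own design choices; this sketch is purely illustrative.

```python
import numpy as np

def pool2x2(x, mode="max"):
    """Non-overlapping 2x2 pooling on a 2D feature map.
    Max pooling keeps the strongest activation per window;
    average pooling keeps the mean response."""
    H, W = x.shape
    x = x[:H - H % 2, :W - W % 2]                 # drop odd edge rows/cols
    blocks = x.reshape(H // 2, 2, W // 2, 2)      # group into 2x2 windows
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

m = np.arange(16, dtype=float).reshape(4, 4)
print(pool2x2(m, "max"))   # top-left window {0,1,4,5} -> 5.0
print(pool2x2(m, "mean"))  # top-left window {0,1,4,5} -> 2.5
```

Max pooling tends to preserve sharp, localized evidence of a nodule, while average pooling smooths responses; which behaves better depends on the network and data, which is why such comparisons are worth running.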
  3. Accurate segmentation and parameterization of the iris in eye images remain a significant challenge for robust iris recognition, especially in off‐angle images captured in less constrained environments. While deep learning techniques (i.e. segmentation‐based convolutional neural networks (CNNs)) are increasingly being used to address this problem, there is a significant lack of information about the mechanism of the related distortions affecting the performance of these networks, and no comprehensive recognition framework is dedicated, in particular, to off‐angle iris recognition using such modules. In this work, the general effect of different gaze angles on ocular biometrics is discussed, and the findings are then related to the CNN‐based off‐angle iris segmentation results and the subsequent recognition performance. An improvement scheme is also introduced to compensate for some segmentation degradations caused by the off‐angle distortions, and a new gaze‐angle estimation and parameterization module is further proposed to estimate and re‐project (correct) the off‐angle iris images back to frontal view. Building on these, several approaches (pipelines) are formulated to configure an end‐to‐end framework for CNN‐based off‐angle iris segmentation and recognition. Within the framework of these approaches, a series of experiments is carried out to determine whether (i) improving the segmentation outputs and/or correcting the output iris images before or after the segmentation can compensate for some off‐angle distortions, (ii) a CNN trained on frontal eye images is capable of detecting and extracting the learnt features on the corrected images, or (iii) the generalisation capability of the network can be improved by training it on iris images of different gaze angles. Finally, the recognition performance of the selected approach is compared against some state‐of‐the‐art off‐angle iris recognition algorithms.
  4. Deep learning methods have achieved impressive performance for multi-class medical image segmentation. However, they are limited in their ability to encode topological interactions among different classes (e.g., containment and exclusion). These constraints naturally arise in biomedical images and can be crucial in improving segmentation quality. In this paper, we introduce a novel topological interaction module to encode the topological interactions into a deep neural network. The implementation is completely convolution-based and thus can be very efficient. This empowers us to incorporate the constraints into end-to-end training and enrich the feature representation of neural networks. The efficacy of the proposed method is validated on different types of interactions. We also demonstrate the generalizability of the method on both proprietary and public challenge datasets, in both 2D and 3D settings, as well as across different modalities such as CT and Ultrasound. Code is available at: https://github.com/TopoXLab/TopoInteraction. 
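The exclusion constraint mentioned above (two structures must not touch) can be checked with a convolution-style operation: dilate one label mask by a 3x3 neighborhood and intersect it with the other mask, so any nonzero overlap marks violating pixels. This mimics the spirit of the paper's convolution-based module, not its actual implementation; the masks and helper names are assumptions.

```python
import numpy as np

def dilate(mask):
    """3x3 binary dilation via shifted ORs (equivalent to convolving
    with a 3x3 ones kernel and thresholding at >= 1)."""
    p = np.pad(mask, 1)
    acc = np.zeros_like(mask)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc |= p[1 + dy:1 + dy + mask.shape[0],
                     1 + dx:1 + dx + mask.shape[1]]
    return acc

def exclusion_violations(a, b):
    """Pixels of mask b that overlap or touch mask a (8-connectivity)."""
    return dilate(a) & b

a = np.zeros((5, 5), dtype=int); a[1, 1] = 1
b = np.zeros((5, 5), dtype=int); b[2, 2] = 1   # diagonally adjacent to a
print(exclusion_violations(a, b).sum())  # 1 -> exclusion constraint violated
```

Because the check reduces to shifts and elementwise operations, it can be expressed with standard convolutions, which is what lets such a module run efficiently inside end-to-end training.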
  5. In this paper, we propose a new Automatic Target Recognition (ATR) system, based on a Deep Convolutional Neural Network (DCNN), to detect targets in Forward-Looking Infrared (FLIR) scenes and recognize their classes. In our proposed ATR framework, a fully convolutional network (FCN) is trained to map the input FLIR imagery to a correspondingly sized target score map at a fixed stride. Potential targets are identified by applying a threshold to the target score map. Finally, regions centered at these target points are fed to a DCNN that classifies them into different target types while rejecting false alarms. The proposed architecture achieves significantly better performance than state-of-the-art methods on two large FLIR image databases.
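The detection stage described above, thresholding a score map and collecting candidate center points for the classifier, can be sketched in a few lines. The scores, threshold value, and function name here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def candidate_targets(score_map, thresh=0.5):
    """Return (row, col) coordinates of all pixels whose target
    score exceeds thresh; these become the candidate regions that
    a downstream classifier accepts or rejects as false alarms."""
    ys, xs = np.nonzero(score_map > thresh)
    return list(zip(ys.tolist(), xs.tolist()))

scores = np.array([[0.1, 0.9, 0.2],
                   [0.0, 0.4, 0.8]])
print(candidate_targets(scores))  # [(0, 1), (1, 2)]
```

The threshold trades recall against the number of candidates handed to the classifier: lowering it catches fainter targets but relies on the second-stage DCNN to reject the extra false alarms.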