skip to main content


Title: Deep learning network with differentiable dynamic programming for retina OCT surface segmentation

Multiple-surface segmentation in optical coherence tomography (OCT) images is a challenging problem, further complicated by the frequent presence of weak image boundaries. Recently, many deep learning-based methods have been developed for this task and yield remarkable performance. Unfortunately, due to the scarcity of training data in medical imaging, it is challenging for deep learning networks to learn the global structure of the target surfaces, including surface smoothness. To bridge this gap, this study proposes to seamlessly unify a U-Net for feature learning with a constrained differentiable dynamic programming module to achieve end-to-end learning for retina OCT surface segmentation to explicitly enforce surface smoothness. It effectively utilizes the feedback from the downstream model optimization module to guide feature learning, yielding better enforcement of global structures of the target surfaces. Experiments on Duke AMD (age-related macular degeneration) and JHU MS (multiple sclerosis) OCT data sets for retinal layer segmentation demonstrated that the proposed method was able to achieve subvoxel accuracy on both datasets, with the mean absolute surface distance (MASD) errors of 1.88 ± 1.96μmand 2.75 ± 0.94μm, respectively, over all the segmented surfaces.

 
more » « less
Award ID(s):
2000425
NSF-PAR ID:
10421067
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Optical Society of America
Date Published:
Journal Name:
Biomedical Optics Express
Volume:
14
Issue:
7
ISSN:
2156-7085
Page Range / eLocation ID:
Article No. 3190
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Segmentation of multiple surfaces in optical coherence tomography (OCT) images is a challenging problem, further complicated by the frequent presence of weak boundaries, varying layer thicknesses, and mutual influence between adjacent surfaces. The traditional graph-based optimal surface segmentation method has proven its effectiveness with its ability to capture various surface priors in a uniform graph model. However, its efficacy heavily relies on handcrafted features that are used to define the surface cost for the “goodness” of a surface. Recently, deep learning (DL) is emerging as a powerful tool for medical image segmentation thanks to its superior feature learning capability. Unfortunately, due to the scarcity of training data in medical imaging, it is nontrivial for DL networks to implicitly learn the global structure of the target surfaces, including surface interactions. This study proposes to parameterize the surface cost functions in the graph model and leverage DL to learn those parameters. The multiple optimal surfaces are then simultaneously detected by minimizing the total surface cost while explicitly enforcing the mutual surface interaction constraints. The optimization problem is solved by the primal-dual interior-point method (IPM), which can be implemented by a layer of neural networks, enabling efficient end-to-end training of the whole network. Experiments on spectral-domain optical coherence tomography (SD-OCT) retinal layer segmentation demonstrated promising segmentation results with sub-pixel accuracy. 
    more » « less
  2. Optimal surface segmentation is a state-of-the-art method used for segmentation of multiple globally optimal surfaces in volumetric datasets. The method is widely used in numerous medical image segmentation applications. However, nodes in the graph based optimal surface segmentation method typically encode uniformly distributed orthogonal voxels of the volume. Thus the segmentation cannot attain an accuracy greater than a single unit voxel, i.e. the distance between two adjoining nodes in graph space. Segmentation accuracy higher than a unit voxel is achievable by exploiting partial volume information in the voxels which shall result in non-equidistant spacing between adjoining graph nodes. This paper reports a generalized graph based multiple surface segmentation method with convex priors which can optimally segment the target surfaces in an irregularly sampled space. The proposed method allows non-equidistant spacing between the adjoining graph nodes to achieve subvoxel segmentation accuracy by utilizing the partial volume information in the voxels. The partial volume information in the voxels is exploited by computing a displacement field from the original volume data to identify the subvoxel-accurate centers within each voxel resulting in non-equidistant spacing between the adjoining graph nodes. The smoothness of each surface modeled as a convex constraint governs the connectivity and regularity of the surface. We employ an edge-based graph representation to incorporate the necessary constraints and the globally optimal solution is obtained by computing a minimum s-t cut. The proposed method was validated on 10 intravascular multi-frame ultrasound image datasets for subvoxel segmentation accuracy. In all cases, the approach yielded highly accurate results. Our approach can be readily extended to higher-dimensional segmentations. 
    more » « less
  3. Deep learning has achieved remarkable results in 3D shape analysis by learning global shape features from the pixel-level over multiple views. Previous methods, however, compute low-level features for entire views without considering part-level information. In contrast, we propose a deep neural network, called Parts4Feature, to learn 3D global features from part-level information in multiple views. We introduce a novel definition of generally semantic parts, which Parts4Feature learns to detect in multiple views from different 3D shape segmentation benchmarks. A key idea of our architecture is that it transfers the ability to detect semantically meaningful parts in multiple views to learn 3D global features. Parts4Feature achieves this by combining a local part detection branch and a global feature learning branch with a shared region proposal module. The global feature learning branch aggregates the detected parts in terms of learned part patterns with a novel multi-attention mechanism, while the region proposal module enables locally and globally discriminative information to be promoted by each other. We demonstrate that Parts4Feature outperforms the state-of-the-art under three large-scale 3D shape benchmarks.

     
    more » « less
  4. Medical image segmentation is one of the most challenging tasks in medical image analysis and widely developed for many clinical applications. While deep learning-based approaches have achieved impressive performance in semantic segmentation, they are limited to pixel-wise settings with imbalanced-class data problems and weak boundary object segmentation in medical images. In this paper, we tackle those limitations by developing a new two-branch deep network architecture which takes both higher level features and lower level features into account. The first branch extracts higher level feature as region information by a common encoder-decoder network structure such as Unet and FCN, whereas the second branch focuses on lower level features as support information around the boundary and processes in parallel to the first branch. Our key contribution is the second branch named Narrow Band Active Contour (NB-AC) attention model which treats the object contour as a hyperplane and all data inside a narrow band as support information that influences the position and orientation of the hyperplane. Our proposed NB-AC attention model incorporates the contour length with the region energy involving a fixed-width band around the curve or surface. The proposed network loss contains two fitting terms: (i) a high level feature (i.e., region) fitting term from the first branch; (ii) a lower level feature (i.e., contour) fitting term from the second branch including the (ii1) length of the object contour and (ii2) regional energy functional formed by the homogeneity criterion of both the inner band and outer band neighboring the evolving curve or surface. The proposed NB-AC loss can be incorporated into both 2D and 3D deep network architectures. The proposed network has been evaluated on different challenging medical image datasets, including DRIVE, iSeg17, MRBrainS18 and Brats18. The experimental results have shown that the proposed NB-AC loss outperforms other mainstream loss functions: Cross Entropy, Dice, Focal on two common segmentation frameworks Unet and FCN. Our 3D network which is built upon the proposed NB-AC loss and 3DUnet framework achieved state-of-the-art results on multiple volumetric datasets. 
    more » « less
  5. Abstract

    Many studies of Earth surface processes and landscape evolution rely on having accurate and extensive data sets of surficial geologic units and landforms. Automated extraction of geomorphic features using deep learning provides an objective way to consistently map landforms over large spatial extents. However, there is no consensus on the optimal input feature space for such analyses. We explore the impact of input feature space for extracting geomorphic features from land surface parameters (LSPs) derived from digital terrain models (DTMs) using convolutional neural network (CNN)‐based semantic segmentation deep learning. We compare four input feature space configurations: (a) a three‐layer composite consisting of a topographic position index (TPI) calculated using a 50 m radius circular window, square root of topographic slope, and TPI calculated using an annulus with a 2 m inner radius and 10 m outer radius, (b) a single illuminating position hillshade, (c) a multidirectional hillshade, and (d) a slopeshade. We test each feature space input using three deep learning algorithms and four use cases: two with natural features and two with anthropogenic features. The three‐layer composite generally provided lower overall losses for the training samples, a higher F1‐score for the withheld validation data, and better performance for generalizing to withheld testing data from a new geographic extent. Results suggest that CNN‐based deep learning for mapping geomorphic features or landforms from LSPs is sensitive to input feature space. Given the large number of LSPs that can be derived from DTM data and the variety of geomorphic mapping tasks that can be undertaken using CNN‐based methods, we argue that additional research focused on feature space considerations is needed and suggest future research directions. We also suggest that the three‐layer composite implemented here can offer better performance in comparison to using hillshades or other common terrain visualization surfaces and is, thus, worth considering for different mapping and feature extraction tasks.

     
    more » « less