Contemporary approaches to instance segmentation in cell science use 2D or 3D convolutional networks depending on the experiment and data structures. However, limitations in microscopy systems or efforts to prevent phototoxicity commonly require recording sub-optimally sampled data, which greatly reduces the utility of such 3D data, especially in crowded samples with significant axial overlap between objects. In such regimes, 2D segmentations are both more reliable for cell morphology and easier to annotate. In this work, we propose the projection enhancement network (PEN), a novel convolutional module that processes the sub-sampled 3D data and produces a 2D RGB semantic compression, and is trained in conjunction with an instance segmentation network of choice to produce 2D segmentations. To train PEN, we use augmentation to increase cell density in a low-density cell image dataset, and we evaluate PEN on curated datasets. We show that with PEN, the learned semantic representation in CellPose encodes depth and greatly improves segmentation performance compared with using maximum intensity projection images as input, but does not similarly aid segmentation in region-based networks like Mask-RCNN. Finally, we dissect the segmentation strength against cell density of PEN with CellPose on disseminated cells from side-by-side spheroids. We present PEN as a data-driven solution to form compressed representations of 3D data that improve 2D segmentations from instance segmentation networks.
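For context, the maximum intensity projection (MIP) baseline named in the abstract collapses a z-stack by taking the per-pixel maximum along the axial dimension, which discards depth. The NumPy sketch below shows that baseline next to a hand-crafted depth-coded RGB projection. The second function is only an illustrative stand-in for the idea of encoding depth in color channels; it is not the learned PEN module, and the function names and the (Z, Y, X) array layout are assumptions made for this example.

```python
import numpy as np


def max_intensity_projection(stack):
    """Collapse a (Z, Y, X) stack to (Y, X) by taking the maximum along z."""
    return stack.max(axis=0)


def depth_coded_rgb(stack):
    """Hand-crafted stand-in (NOT the learned PEN module).

    Splits the z-slices into three contiguous bands and projects each band
    into its own color channel, so that depth survives the 3D -> 2D step.
    """
    if stack.shape[0] < 3:
        raise ValueError("need at least 3 z-slices to fill three channels")
    # Three contiguous groups of slice indices: shallow, middle, deep.
    bands = np.array_split(np.arange(stack.shape[0]), 3)
    channels = [stack[idx].max(axis=0) for idx in bands]
    return np.stack(channels, axis=-1)  # shape (Y, X, 3)


# Two toy "objects" that overlap axially at the same (y, x) position.
stack = np.zeros((6, 4, 4))
stack[0, 1, 1] = 1.0  # shallow object
stack[5, 1, 1] = 0.5  # deep object

mip = max_intensity_projection(stack)  # the two objects merge into one value
rgb = depth_coded_rgb(stack)           # the two objects land in different channels
```

In the MIP, the overlapping objects are indistinguishable at that pixel, whereas the depth-coded image keeps them in separate channels. PEN learns its compression jointly with the segmentation network rather than using fixed bands like these.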
- Award ID(s): 1844627
- PAR ID: 10512469
- Publisher / Repository: IOPscience
- Date Published:
- Journal Name: Physical Biology
- Volume: 20
- Issue: 6
- ISSN: 1478-3967
- Page Range / eLocation ID: 066003
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Monocular 3D object parsing is highly desirable in various scenarios including occlusion reasoning and holistic scene interpretation. We present a deep convolutional neural network (CNN) architecture to localize semantic parts in the 2D image and in 3D space while inferring their visibility states, given a single RGB image. Our key insight is to exploit domain knowledge to regularize the network by deeply supervising its hidden layers, in order to sequentially infer intermediate concepts associated with the final task. To acquire training data in desired quantities with ground truth 3D shape and relevant concepts, we render 3D object CAD models to generate large-scale synthetic data and simulate challenging occlusion configurations between objects. We train the network only on synthetic data and demonstrate state-of-the-art performance on real image benchmarks including an extended version of KITTI, PASCAL VOC, PASCAL3D+ and IKEA for 2D and 3D keypoint localization and instance segmentation. The empirical results substantiate the utility of our deep supervision scheme by demonstrating effective transfer of knowledge from synthetic data to real images, resulting in less overfitting compared to standard end-to-end training.
- Unlike images, which are represented in regular dense grids, 3D point clouds are irregular and unordered, hence applying convolution on them can be difficult. In this paper, we extend the dynamic filter to a new convolution operation, named PointConv. PointConv can be applied on point clouds to build deep convolutional networks. We treat convolution kernels as nonlinear functions of the local coordinates of 3D points comprised of weight and density functions. With respect to a given point, the weight functions are learned with multi-layer perceptron networks and the density functions through kernel density estimation. A novel reformulation is proposed for efficiently computing the weight functions, which allowed us to dramatically scale up the network and significantly improve its performance. The learned convolution kernel can be used to compute translation-invariant and permutation-invariant convolution on any point set in the 3D space. In addition, PointConv can be used as a deconvolution operator to propagate features from a subsampled point cloud back to its original resolution. Experiments on ModelNet40, ShapeNet, and ScanNet show that deep convolutional neural networks built on PointConv are able to achieve state-of-the-art results on challenging semantic segmentation benchmarks on 3D point clouds. Our experiments converting CIFAR-10 into a point cloud also showed that networks built on PointConv can match the performance of convolutional networks on 2D images of a similar structure.
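The density term mentioned in this abstract can be illustrated with a plain Gaussian kernel density estimate over a point set. This is a minimal sketch, not the PointConv implementation: the function name, the fixed `bandwidth` value, and the use of a raw inverse density as a per-point scale are all assumptions made here for illustration.

```python
import numpy as np


def gaussian_kde_density(points, bandwidth=0.1):
    """Estimate a density at every point of an (N, D) array with a Gaussian KDE.

    `bandwidth` is an illustrative choice, not a value taken from the paper.
    """
    # Pairwise squared distances between all points, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # Normalization constant of a D-dimensional isotropic Gaussian.
    dim = points.shape[1]
    norm = (2.0 * np.pi * bandwidth ** 2) ** (dim / 2.0)
    return kernel.mean(axis=1) / norm


# Three tightly clustered points and one isolated point.
points = np.array([
    [0.00, 0.00, 0.00],
    [0.01, 0.00, 0.00],
    [0.00, 0.01, 0.00],
    [1.00, 1.00, 1.00],
])
density = gaussian_kde_density(points)
inverse_density = 1.0 / density  # larger for the isolated point
```

Scaling contributions by inverse density down-weights points in densely sampled regions, which is one way to compensate for non-uniform sampling of a point cloud.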
- We present SHRED, a method for 3D SHape REgion Decomposition. SHRED takes a 3D point cloud as input and uses learned local operations to produce a segmentation that approximates fine-grained part instances. We endow SHRED with three decomposition operations: splitting regions, fixing the boundaries between regions, and merging regions together. Modules are trained independently and locally, allowing SHRED to generate high-quality segmentations for categories not seen during training. We train and evaluate SHRED with fine-grained segmentations from PartNet; using its merge-threshold hyperparameter, we show that SHRED produces segmentations that better respect ground-truth annotations compared with baseline methods, at any desired decomposition granularity. Finally, we demonstrate that SHRED is useful for downstream applications, outperforming all baselines on zero-shot fine-grained part instance segmentation and few-shot fine-grained semantic segmentation when combined with methods that learn to label shape regions.
- Scene reconstruction using Monodepth2 (Monocular Depth Inference) provides depth maps from a single RGB camera, but the outputs are filled with noise and inconsistencies. Instance segmentation using a Mask R-CNN (Region-Based Convolutional Neural Network) deep model can provide object segmentation results in 2D but lacks 3D information. In this paper we propose to integrate the results of instance segmentation via Mask R-CNN, CAD model Car Shape Alignment, and depth from Monodepth2 together with classical dynamic vision techniques to create a High-level Semantic Model with separability, robustness, consistency and saliency. The model is useful for virtualized rendering, semantic augmented reality and automatic driving. Experimental results are provided to validate the approach.
- 7T magnetic resonance imaging (MRI) has the potential to drive our understanding of human brain function through new contrast and enhanced resolution. Whole brain segmentation is a key neuroimaging technique that allows for region-by-region analysis of the brain. Segmentation is also an important preliminary step that provides spatial and volumetric information for running other neuroimaging pipelines. Spatially localized atlas network tiles (SLANT) is a popular 3D convolutional neural network (CNN) tool that breaks the whole brain segmentation task into localized sub-tasks. Each sub-task involves a specific spatial location handled by an independent 3D convolutional network to provide high resolution whole brain segmentation results. SLANT has been widely used to generate whole brain segmentations from structural scans acquired on 3T MRI. However, the use of SLANT for whole brain segmentation from structural 7T MRI scans has not been successful due to the inhomogeneous image contrast usually seen across the brain in 7T MRI. For instance, we demonstrate that the mean percent difference of SLANT label volumes between a 3T scan-rescan is approximately 1.73%, whereas its 3T-7T scan-rescan counterpart has higher differences of around 15.13%. Our approach to address this problem is to register the whole brain segmentation performed on 3T MRI to 7T MRI and use this information to finetune SLANT for structural 7T MRI. With the finetuned SLANT pipeline, we observe a lower mean relative difference of ~8.43% in the label volumes acquired from structural 7T MRI data. The Dice similarity coefficient between the SLANT segmentation on the 3T MRI scan and the SLANT segmentation on the 7T MRI scan after finetuning increased from 0.79 to 0.83 with p<0.01. These results suggest finetuning of SLANT is a viable solution for improving whole brain segmentation on high resolution 7T structural imaging.
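The two agreement measures quoted in this abstract can be sketched in a few lines of NumPy. The Dice similarity coefficient follows its standard definition, 2|A∩B| / (|A| + |B|). The abstract does not give the exact formula for the mean percent difference in label volumes, so the symmetric form used below (absolute difference divided by the mean of the two volumes) is an assumption, as are the function names.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two binary masks: 2|A & B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total


def mean_percent_volume_difference(seg_a, seg_b, labels):
    """Mean over labels of 100 * |Va - Vb| / mean(Va, Vb).

    Assumed formula; volumes are counted in voxels.
    """
    diffs = []
    for label in labels:
        vol_a = np.count_nonzero(seg_a == label)
        vol_b = np.count_nonzero(seg_b == label)
        mean_vol = (vol_a + vol_b) / 2.0
        if mean_vol == 0:
            continue  # label absent from both segmentations
        diffs.append(100.0 * abs(vol_a - vol_b) / mean_vol)
    return float(np.mean(diffs)) if diffs else 0.0


# Toy 1D "segmentations" with labels 1 and 2 (0 is background).
seg_a = np.array([1, 1, 2, 2, 2, 0])
seg_b = np.array([1, 1, 1, 2, 2, 0])

dice_label_1 = dice_coefficient(seg_a == 1, seg_b == 1)
volume_diff = mean_percent_volume_difference(seg_a, seg_b, labels=[1, 2])
```

Dice measures spatial overlap per label, while the volume difference ignores where voxels are and compares only how many there are, so the two metrics can disagree.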