skip to main content


Title: A-CNN: Annularly Convolutional Neural Networks on Point Clouds
Analyzing the geometric and semantic properties of 3D point clouds through the deep networks is still challenging due to the irregularity and sparsity of samplings of their geometric structures. This paper presents a new method to define and compute convolution directly on 3D point clouds by the proposed annular convolution. This new convolution operator can better capture the local neighborhood geometry of each point by specifying the (regular and dilated) ring-shaped structures and directions in the computation. It can adapt to the geometric variability and scalability at the signal processing level. We apply it to the developed hierarchical neural networks for object classification, part segmentation, and semantic segmentation in large-scale scenes. The extensive experiments and comparisons demonstrate that our approach outperforms the state-of-the-art methods on a variety of standard benchmark datasets (e.g., ModelNet10, ModelNet40, ShapeNetpart, S3DIS, and ScanNet).  more » « less
Award ID(s):
1657364 1845962 1816511
NSF-PAR ID:
10097942
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
IEEE Conference on Computer Vision and Pattern Recognition
ISSN:
2163-6648
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Analyzing the geometric and semantic properties of 3D point clouds through the deep networks is still challenging due to the irregularity and sparsity of samplings of their geometric structures. This paper presents a new method to define and compute convolution directly on 3D point clouds by the proposed annular convolution. This new convolution operator can better capture the local neighborhood geometry of each point by specifying the (regular and dilated) ring-shaped structures and directions in the computation. It can adapt to the geometric variability and scalability at the signal processing level. We apply it to the developed hierarchical neural networks for object classification, part segmentation, and semantic segmentation in large-scale scenes. The extensive experiments and comparisons demonstrate that our approach outperforms the state-of-the-art methods on a variety of standard benchmark datasets (e.g., ModelNet10, ModelNet40, ShapeNetpart, S3DIS, and ScanNet). 
    more » « less
  2. Unlike images which are represented in regular dense grids, 3D point clouds are irregular and unordered, hence applying convolution on them can be difficult. In this paper, we extend the dynamic filter to a new convolution operation, named PointConv. PointConv can be applied on point clouds to build deep convolutional networks. We treat convolution kernels as nonlinear functions of the local coordinates of 3D points comprised of weight and density functions. With respect to a given point, the weight functions are learned with multi-layer perceptron networks and the density functions through kernel density estimation. A novel reformulation is proposed for efficiently computing the weight functions, which allowed us to dramatically scale up the network and significantly improve its performance. The learned convolution kernel can be used to compute translation-invariant and permutation-invariant convolution on any point set in the 3D space. Besides, PointConv can also be used as deconvolution operators to propagate features from a subsampled point cloud back to its original resolution. Experiments on ModelNet40, ShapeNet, and ScanNet show that deep convolutional neural networks built on PointConv are able to achieve state-of-the-art on challenging semantic segmentation benchmarks on 3D point clouds. Besides, our experiments converting CIFAR-10 into a point cloud showed that networks built on PointConv can match the performance of convolutional networks in 2D images of a similar structure. 
    more » « less
  3. Abstract

    We present a method that detects boundaries of parts in 3D shapes represented as point clouds. Our method is based on a graph convolutional network architecture that outputs a probability for a point to lie in an area that separates two or more parts in a 3D shape. Our boundary detector is quite generic: it can be trained to localize boundaries of semantic parts or geometric primitives commonly used in 3D modeling. Our experiments demonstrate that our method can extract more accurate boundaries that are closer to ground‐truth ones compared to alternatives. We also demonstrate an application of our network to fine‐grained semantic shape segmentation, where we also show improvements in terms of part labeling performance.

     
    more » « less
  4. 3D CT point clouds reconstructed from the original CT images are naturally represented in real-world coordinates. Compared with CT images, 3D CT point clouds contain invariant geometric features with irregular spatial distributions from multiple viewpoints. This paper rethinks pulmonary nodule detection in CT point cloud representations. We first extract the multi-view features from a sparse convolutional (SparseConv) encoder by rotating the point clouds with different angles in the world coordinate. Then, to simultaneously learn the discriminative and robust spatial features from various viewpoints, a nodule proposal optimization schema is proposed to obtain coarse nodule regions by aggregating consistent nodule proposals prediction from multi-view features. Last, the multi-level features and semantic segmentation features extracted from a SparseConv decoder are concatenated with multi-view features for final nodule region regression. Experiments on the benchmark dataset (LUNA16) demonstrate the feasibility of applying CT point clouds in lung nodule detection task. Furthermore, we observe that by combining multi-view predictions, the performance of the proposed framework is greatly improved compared to single-view, while the interior texture features of nodules from images are more suitable for detecting nodules in small sizes. 
    more » « less
  5. The interaction and dimension of points are two important axes in designing point operators to serve hierarchical 3D models. Yet, these two axes are heterogeneous and challenging to fully explore. Existing works craft point operator under a single axis and reuse the crafted operator in all parts of 3D models. This overlooks the opportunity to better combine point interactions and dimensions by exploiting varying geometry/density of 3D point clouds. In this work, we establish PIDS, a novel paradigm to jointly explore point interactions and point dimensions to serve semantic segmentation on point cloud data. We establish a large search space to jointly consider versatile point interactions and point dimensions. This supports point operators with various geometry/density considerations. The enlarged search space with heterogeneous search components calls for a better ranking of candidate models. To achieve this, we improve the search space exploration by leveraging predictor-based Neural Architecture Search (NAS), and enhance the quality of prediction by assigning unique encoding to heterogeneous search components based on their priors. We thoroughly evaluate the networks crafted by PIDS on two semantic segmentation benchmarks, showing 1% mIOU improvement on SemanticKITTI and S3DIS over state-of-the-art 3D models. 
    more » « less