

Title: Directionally Convolutional Networks for 3D Shape Segmentation
Previous approaches to 3D shape segmentation mostly rely on heuristic processing and hand-tuned geometric descriptors. In this paper, we propose a novel 3D shape representation learning approach, the Directionally Convolutional Network (DCN), to solve the shape segmentation problem. DCN extends convolution operations from images to the surface mesh of 3D shapes. With DCN, we learn effective shape representations from raw geometric features, i.e., face normals and distances, to achieve robust segmentation. More specifically, we propose a two-stream segmentation framework: one stream is built on the proposed DCN, taking face normals as input, while the other is a neural network taking the face distance histogram as input. The learned shape representations from the two streams are fused by an element-wise product. Finally, a Conditional Random Field (CRF) is applied to optimize the segmentation. Through extensive experiments on benchmark datasets, we demonstrate that our approach outperforms the current state of the art (both classic and deep learning-based methods) on a large variety of 3D shapes.
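The two-stream fusion described above can be illustrated with a minimal PyTorch sketch. The directional convolution itself is specific to the paper, so a per-face MLP stands in for the DCN stream here; all layer sizes and names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two-stream fusion described in the abstract.
# The directional convolution is paper-specific; a per-face MLP stands
# in for the DCN stream, and all layer sizes are illustrative.
import torch
import torch.nn as nn

class TwoStreamSegmenter(nn.Module):
    def __init__(self, n_classes, hist_bins=64, feat_dim=128):
        super().__init__()
        # Stream 1: per-face normals (3-D vectors); in the paper this
        # stream is the Directionally Convolutional Network.
        self.normal_stream = nn.Sequential(
            nn.Linear(3, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )
        # Stream 2: a plain neural network on the face distance histogram.
        self.hist_stream = nn.Sequential(
            nn.Linear(hist_bins, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, face_normals, face_hist):
        # face_normals: (n_faces, 3); face_hist: (n_faces, hist_bins)
        a = self.normal_stream(face_normals)
        b = self.hist_stream(face_hist)
        fused = a * b                   # element-wise product fusion
        return self.classifier(fused)   # per-face scores; a CRF refines these
```

The element-wise product forces agreement between the streams: a face receives a strong fused response only when both its normal-based and distance-based representations activate the same features.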
Award ID(s):
1657364
NSF-PAR ID:
10058462
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE International Conference on Computer Vision (ICCV)
Page Range / eLocation ID:
2698 - 2707
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Learning pose invariant representations is a fundamental problem in shape analysis. Most existing deep learning algorithms for 3D shape analysis are not robust to rotations and are often trained on synthetic datasets of pre-aligned shapes, yielding poor generalization to unseen poses. This observation motivates a growing interest in rotation invariant and equivariant methods. The field of rotation equivariant deep learning has developed in recent years thanks to a well-established theory of Lie group representations and convolutions. A fundamental problem in equivariant deep learning is to design activation functions that are both informative and equivariance-preserving. The recently introduced Tensor Field Network (TFN) framework provides a rotation equivariant network design for point cloud analysis: TFN features undergo a rotation in feature space given a rotation of the input point cloud. TFN and similar designs restrict nonlinearities to rotation invariant quantities, such as the norms of equivariant features, in order to preserve equivariance, which makes them unable to capture directional information (this norm-based nonlinearity is sketched after this abstract). In a recent work entitled "Gauge Equivariant Mesh CNNs: Anisotropic Convolutions on Geometric Graphs", de Haan et al. interpret 2D rotation equivariant features as Fourier coefficients of functions on the circle. In this work we transpose the idea of de Haan et al. to 3D by interpreting TFN features as spherical harmonics coefficients of functions on the sphere. We introduce a new equivariant nonlinearity and pooling for TFN. We show improvements over the original TFN design and other equivariant nonlinearities on classification and segmentation tasks. Furthermore, our method is competitive with state-of-the-art rotation invariant methods in some instances.
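For context, the norm-based nonlinearity that TFN-style designs rely on (and that the nonlinearity above improves upon) fits in a few lines: each degree-l feature, a (2l+1)-vector, is rescaled by a function of its rotation-invariant norm, so its direction, and hence equivariance, is preserved. This is a minimal numpy sketch with illustrative shapes, not code from the paper.

```python
import numpy as np

def norm_relu(feats, bias=0.1, eps=1e-8):
    """feats maps degree l -> array of shape (n_points, channels, 2l + 1)."""
    out = {}
    for l, f in feats.items():
        # The norm of an equivariant feature is rotation invariant, so any
        # function of it yields an equivariance-preserving scaling factor.
        norms = np.linalg.norm(f, axis=-1, keepdims=True)
        scale = np.maximum(norms - bias, 0.0) / (norms + eps)  # ReLU on the norm
        out[l] = f * scale  # direction is untouched
    return out

# Toy usage: degree-0 (scalar) and degree-1 (vector) features for 5 points.
feats = {0: np.random.randn(5, 8, 1), 1: np.random.randn(5, 8, 3)}
out = norm_relu(feats)
```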
    Single-image 3D face reconstruction with accurate geometric details is a critical and challenging task, owing to the similar appearance across the facial surface and the fine details of facial organs. In this work, we introduce a self-supervised 3D face reconstruction approach from a single image that can recover detailed textures under different camera settings. The proposed network learns high-quality disparity maps from stereo face images during the training stage, while only a single face image is required to generate the 3D model in real applications. To recover fine details of each organ and of the facial surface, the framework introduces facial-landmark spatial consistency to constrain the learning process at the local point level, and a segmentation scheme on facial organs to constrain correspondences at the organ level. The face shape and textures are further refined with holistic constraints derived from varying illumination and shading information (a sketch of how such constraints might be combined follows below). The proposed learning framework recovers more accurate 3D facial details, both quantitatively and qualitatively, than state-of-the-art 3DMM- and geometry-based single-image reconstruction algorithms.
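How the point-level, organ-level, and holistic constraints might be combined can be sketched as a weighted sum of losses. The helper names, tensor shapes, and weights below are hypothetical placeholders, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def total_loss(pred_landmarks, gt_landmarks,   # (N, 2) landmark positions
               seg_logits, gt_seg,             # (P, C) logits, (P,) labels
               rendered, image,                # rendered vs. input face images
               w_lmk=1.0, w_seg=0.5, w_photo=1.0):
    # Local point level: landmark spatial consistency.
    loss_lmk = F.mse_loss(pred_landmarks, gt_landmarks)
    # Organ level: segmentation consistency across facial organs.
    loss_seg = F.cross_entropy(seg_logits, gt_seg)
    # Holistic: illumination/shading agreement with the input image.
    loss_photo = F.l1_loss(rendered, image)
    return w_lmk * loss_lmk + w_seg * loss_seg + w_photo * loss_photo
```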
    This paper introduces a deep neural network based method, DeepOrganNet, to generate and visualize high-fidelity 3D / 4D organ geometric models from single-view medical images with complicated backgrounds in real time. Traditional 3D / 4D medical image reconstruction requires on the order of hundreds of projections, which costs prohibitive computational time and delivers an undesirably high imaging / radiation dose to human subjects. Moreover, further laborious processing is then needed to segment or extract accurate 3D organ models. The computational time and imaging dose can be reduced by decreasing the number of projections, but the reconstructed image quality degrades accordingly. To our knowledge, no prior method directly and explicitly reconstructs multiple 3D organ meshes from a single 2D medical grayscale image on the fly. Given single-view 2D medical images, e.g., 3D / 4D-CT projections or X-ray images, our end-to-end DeepOrganNet framework can efficiently and effectively reconstruct 3D / 4D lung models with a variety of geometric shapes by learning smooth deformation fields from multiple templates, based on a trivariate tensor-product deformation technique (sketched below) and an informative latent descriptor extracted from the input 2D images. The proposed method is guaranteed to generate high-quality, high-fidelity manifold meshes for 3D / 4D lung models, which current deep learning based approaches to single-image shape reconstruction cannot. The major contributions of this work are to accurately reconstruct 3D organ shapes from a single 2D projection, to significantly reduce the procedure time so as to allow on-the-fly visualization, and to dramatically reduce the imaging dose for human subjects. Experimental results are evaluated against a traditional reconstruction method and the state of the art in deep learning on extensive 3D and 4D examples, including both synthetic phantom and real patient datasets. The proposed method needs only several milliseconds to generate organ meshes with 10K vertices, giving it great potential for use in real-time image-guided radiation therapy (IGRT).
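The trivariate tensor-product deformation the abstract builds on is classic free-form deformation (FFD): a template's vertices are re-expressed in Bernstein-polynomial coordinates of a control-point lattice, and moving the control points smoothly deforms the mesh. The numpy sketch below uses an illustrative 4x4x4 lattice; in a DeepOrganNet-style pipeline, a network would predict the control-point offsets from the image descriptor.

```python
import numpy as np
from math import comb

def bernstein(n, i, t):
    # i-th Bernstein basis polynomial of degree n, evaluated at t in [0, 1].
    return comb(n, i) * (t ** i) * ((1 - t) ** (n - i))

def ffd(vertices, control_points):
    """Deform vertices in [0,1]^3 with an (L+1, M+1, N+1, 3) control lattice."""
    L, M, N = (s - 1 for s in control_points.shape[:3])
    out = np.zeros_like(vertices)
    for i in range(L + 1):
        for j in range(M + 1):
            for k in range(N + 1):
                w = (bernstein(L, i, vertices[:, 0])
                     * bernstein(M, j, vertices[:, 1])
                     * bernstein(N, k, vertices[:, 2]))
                out += w[:, None] * control_points[i, j, k]
    return out

# An undisplaced uniform lattice reproduces the template exactly.
grid = np.stack(np.meshgrid(*[np.linspace(0, 1, 4)] * 3, indexing="ij"), axis=-1)
verts = np.random.rand(100, 3)
assert np.allclose(ffd(verts, grid), verts, atol=1e-6)
```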
    Reconstructing the 3D shape of an object observed in a single image is a challenging task. Recent approaches rely on visual cues extracted from the image by a deep network. In this work, we leverage recent advances in monocular scene understanding to incorporate an additional geometric cue: surface normals. To this end, we propose a novel optimization layer that encourages the face normals of the reconstructed shape to align with the estimated surface normals. We develop a computationally efficient conjugate-gradient-based method that avoids explicitly forming a high-dimensional sparse matrix (a matrix-free variant is sketched below). We show that this framework achieves compelling shape reconstruction results on the challenging Pix3D and ShapeNet datasets.
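The conjugate-gradient idea above is standard: for a symmetric positive-definite system, CG only ever needs the product of the matrix with a vector, so the high-dimensional sparse matrix never has to be assembled. A minimal numpy sketch with an illustrative operator:

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-8, max_iter=200):
    """Solve A x = b given only a function computing A @ v (A must be SPD)."""
    x = np.zeros_like(b)
    r = b - matvec(x)   # residual
    p = r.copy()        # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Example: solve (I + c * A^T A) x = b without materializing the matrix.
n, c = 500, 0.1
A = np.random.randn(50, n)  # stands in for a tall sparse operator
b = np.random.randn(n)
x = conjugate_gradient(lambda v: v + c * (A.T @ (A @ v)), b)
```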
    Many industries, such as human-centric product manufacturing, are calling for mass customization with personalized products. One key enabler of mass customization is 3D printing, which makes flexible design and manufacturing possible. However, personalized designs bring challenges for shape matching and analysis, owing to their high complexity and shape variations. Traditional shape matching methods are limited to spatial alignment, finding a transformation matrix between two shapes; they cannot determine a vertex-to-vertex or feature-to-feature correspondence between the two shapes, and hence cannot directly measure the deformation of the shape or of the features of interest. To measure the deformations widely seen in the mass customization paradigm and to address the limitations of alignment-based shape matching, we cast the geometry matching of deformed shapes as a correspondence problem. The problem is challenging due to its huge solution space and nonlinear complexity, which are difficult for conventional optimization methods to handle. Motivated by the observation that well-established, massive databases provide correspondence results for treated teeth models, we propose a learning-based method for the shape correspondence problem. Specifically, a state-of-the-art geometric deep learning method is used to learn the correspondence of a set of collected deformed shapes. By learning the deformations of the models, the underlying variations of the shapes are extracted and used to find the vertex-to-vertex mapping among these shapes (the matching step is sketched below). We demonstrate the application of the proposed approach in the orthodontics industry, and the experimental results show that the proposed method predicts correspondences quickly and accurately and is robust to extreme cases. Furthermore, the proposed method is well suited to deformed-shape analysis in mass customization enabled by 3D printing.
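Once per-vertex descriptors have been learned, the vertex-to-vertex map can be read off by nearest-neighbor matching in feature space. The sketch below uses random placeholders for the descriptors a geometric deep network would produce; all shapes and dimensions are illustrative.

```python
import numpy as np

def match_vertices(desc_a, desc_b):
    """For each vertex of shape A, the index of its closest descriptor on B."""
    # Pairwise squared distances via |a - b|^2 = |a|^2 - 2 a.b + |b|^2.
    d2 = (np.sum(desc_a ** 2, axis=1, keepdims=True)
          - 2.0 * desc_a @ desc_b.T
          + np.sum(desc_b ** 2, axis=1))
    return np.argmin(d2, axis=1)

desc_a = np.random.randn(1000, 32)     # learned descriptors, shape A
desc_b = np.random.randn(1200, 32)     # learned descriptors, shape B
corr = match_vertices(desc_a, desc_b)  # corr[i]: match on B for A's vertex i
```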