This content will become publicly available on March 1, 2023

Title: Fractional Motion Estimation for Point Cloud Compression
Motivated by the success of fractional pixel motion in video coding, we explore the design of motion estimation with fractional-voxel resolution for compression of color attributes of dynamic 3D point clouds. Our proposed block-based fractional-voxel motion estimation scheme takes into account the fundamental differences between point clouds and videos, i.e., the irregularity of the distribution of voxels within a frame and across frames. We show that motion compensation can benefit from the higher resolution reference and more accurate displacements provided by fractional precision. Our proposed scheme significantly outperforms comparable methods that only use integer motion. The proposed scheme can be combined with and add sizeable gains to state-of-the-art systems that use transforms such as Region Adaptive Graph Fourier Transform and Region Adaptive Haar Transform.
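The benefit of fractional precision described above can be illustrated with a toy block-matching search. This is a minimal sketch under stated assumptions, not the authors' scheme: the function names (`nearest_color`, `block_cost`, `search_motion`), the scalar "color" values, the nearest-neighbor matching rule, and the tiny data set are all hypothetical, and the reference is assumed to already contain samples at half-voxel positions (standing in for a higher resolution reference).

```python
from itertools import product

def nearest_color(point, ref_points, ref_colors):
    """Color of the reference point closest (squared Euclidean) to `point`."""
    best = min(
        range(len(ref_points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, ref_points[i])),
    )
    return ref_colors[best]

def block_cost(block_points, block_colors, ref_points, ref_colors, mv):
    """Sum of squared color errors after displacing the block by `mv`."""
    cost = 0.0
    for p, c in zip(block_points, block_colors):
        shifted = tuple(a + d for a, d in zip(p, mv))
        cost += (c - nearest_color(shifted, ref_points, ref_colors)) ** 2
    return cost

def search_motion(block_points, block_colors, ref_points, ref_colors,
                  radius=1, step=1.0):
    """Exhaustive search over displacements in [-radius, radius]^3.

    step=1.0 gives integer motion; step=0.5 gives half-voxel motion.
    Ties in cost are broken in favor of the shortest displacement.
    """
    n = int(round(radius / step))
    offsets = [k * step for k in range(-n, n + 1)]

    def key(mv):
        cost = block_cost(block_points, block_colors, ref_points, ref_colors, mv)
        return (cost, sum(d * d for d in mv))

    best_mv = min(product(offsets, repeat=3), key=key)
    return best_mv, block_cost(block_points, block_colors,
                               ref_points, ref_colors, best_mv)

# Hypothetical reference with samples at half-voxel positions along x.
ref_points = [(0.0, 0, 0), (0.5, 0, 0), (1.0, 0, 0), (1.5, 0, 0), (2.0, 0, 0)]
ref_colors = [10, 15, 20, 25, 30]

# Hypothetical current block: two voxels whose colors match the reference
# only after a half-voxel shift along x.
block_points = [(0, 0, 0), (1, 0, 0)]
block_colors = [15, 25]

int_mv, int_cost = search_motion(block_points, block_colors,
                                 ref_points, ref_colors, step=1.0)
frac_mv, frac_cost = search_motion(block_points, block_colors,
                                   ref_points, ref_colors, step=0.5)
print("integer motion:   ", int_mv, int_cost)
print("half-voxel motion:", frac_mv, frac_cost)
```

Because the half-voxel candidate set contains every integer candidate, the fractional search can never produce a higher matching cost than the integer search over the same radius. In this toy case it finds an exact match that integer motion misses.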
Award ID(s):
1956190
NSF-PAR ID:
10340130
Journal Name:
2022 Data Compression Conference (DCC)
Page Range or eLocation-ID:
369 to 378
Sponsoring Org:
National Science Foundation
More Like this
  1. We introduce chroma subsampling for 3D point cloud attribute compression by proposing a novel technique to sample points irregularly placed in 3D space. While most current video compression standards use chroma subsampling, these chroma subsampling methods cannot be directly applied to 3D point clouds, given their irregularity and sparsity. In this work, we develop a framework to incorporate chroma subsampling into geometry-based point cloud encoders, such as region adaptive hierarchical transform (RAHT) and region adaptive graph Fourier transform (RAGFT). We propose different sampling patterns on a regular 3D grid to sample the points at different rates. We use a simple graph-based nearest neighbor interpolation technique to reconstruct the full resolution point cloud at the decoder end. Experimental results demonstrate that our proposed method provides significant coding gains with negligible impact on the reconstruction quality. For some sequences, we observe a bitrate reduction of 10-15% under the Bjontegaard metric. More generally, perceptual masking makes it possible to achieve larger bitrate reductions without visible changes in quality.
  2. 3D object recognition accuracy can be improved by learning the multi-scale spatial features from 3D spatial geometric representations of objects such as point clouds, 3D models, surfaces, and RGB-D data. Current deep learning approaches learn such features either using structured data representations (voxel grids and octrees) or from unstructured representations (graphs and point clouds). Learning features from such structured representations is limited by the restriction on resolution and tree depth, while unstructured representations create a challenge due to non-uniformity among data samples. In this paper, we propose an end-to-end multi-level learning approach on a multi-level voxel grid to overcome these drawbacks. To demonstrate the utility of the proposed multi-level learning, we use a multi-level voxel representation of 3D objects to perform object recognition. The multi-level voxel representation consists of a coarse voxel grid that contains volumetric information of the 3D object. In addition, each voxel in the coarse grid that contains a portion of the object boundary is subdivided into multiple fine-level voxel grids. The performance of our multi-level learning algorithm for object recognition is comparable to dense voxel representations while using significantly lower memory.
  3. We present an efficient voxelization method to encode the geometry and attributes of 3D point clouds obtained from autonomous vehicles. Due to the circular scanning trajectory of sensors, the geometry of LiDAR point clouds is inherently different from that of point clouds captured from RGBD cameras. Our method exploits these specific properties by representing points in cylindrical coordinates instead of conventional Cartesian coordinates. We demonstrate that Region Adaptive Hierarchical Transform (RAHT) can be extended to this setting, leading to attribute encoding based on a volumetric partition in cylindrical coordinates. Experimental results show that our proposed voxelization outperforms conventional approaches based on Cartesian coordinates for this type of data. We observe a significant improvement in attribute coding performance with 5-10% reduction in bitrate and octree representation with 35-45% reduction in bits.
  4. Aims. Thanks to the high angular resolution, sensitivity, image fidelity, and frequency coverage of ALMA, we aim to improve our understanding of star formation. One of the breakthroughs expected from ALMA, which is the basis of our Cycle 5 ALMA-IMF Large Program, is the question of the origin of the initial mass function (IMF) of stars. Here we present the ALMA-IMF protocluster selection, first results, and scientific prospects. Methods. ALMA-IMF imaged a total noncontiguous area of ~53 pc², covering extreme, nearby protoclusters of the Milky Way. We observed 15 massive (2.5−33 × 10³ M⊙), nearby (2−5.5 kpc) protoclusters that were selected to span relevant early protocluster evolutionary stages. Our 1.3 and 3 mm observations provide continuum images that are homogeneously sensitive to point-like cores with masses of ~0.2 M⊙ and ~0.6 M⊙, respectively, with a matched spatial resolution of ~2000 au across the sample at both wavelengths. Moreover, with the broad spectral coverage provided by ALMA, we detect lines that probe the ionized and molecular gas, as well as complex molecules. Taken together, these data probe the protocluster structure, kinematics, chemistry, and feedback over scales from clouds to filaments to cores. Results. We classify ALMA-IMF protoclusters as Young (six protoclusters), Intermediate (five protoclusters), or Evolved (four protoclusters) based on the amount of dense gas in the cloud that has potentially been impacted by H II region(s). The ALMA-IMF catalog contains ~700 cores that span a mass range of ~0.15 M⊙ to ~250 M⊙ at a typical size of ~2100 au. We show that this core sample has no significant distance bias and can be used to build core mass functions (CMFs) at similar physical scales. Significant gas motions, which we highlight here in the G353.41 region, are traced down to core scales and can be used to look for inflowing gas streamers and to quantify the impact of the possible associated core mass growth on the shape of the CMF with time. Our first analysis does not reveal any significant evolution of the matter concentration from clouds to cores (i.e., from 1 pc to 0.01 pc scales) or from the youngest to more evolved protoclusters, indicating that cloud dynamical evolution and stellar feedback have for the moment only had a slight effect on the structure of high-density gas in our sample. Furthermore, the first-look analysis of the line richness toward bright cores indicates that the survey encompasses several tens of hot cores, of which we highlight the most massive in the G351.77 cloud. Their homogeneous characterization can be used to constrain the emerging molecular complexity in protostars of high to intermediate masses. Conclusions. The ALMA-IMF Large Program is uniquely designed to transform our understanding of the IMF origin, taking the effects of cloud characteristics and evolution into account. It will provide the community with an unprecedented database with a high legacy value for protocluster clouds, filaments, cores, hot cores, outflows, inflows, and stellar clusters studies.
  5. Unlike images which are represented in regular dense grids, 3D point clouds are irregular and unordered, hence applying convolution on them can be difficult. In this paper, we extend the dynamic filter to a new convolution operation, named PointConv. PointConv can be applied on point clouds to build deep convolutional networks. We treat convolution kernels as nonlinear functions of the local coordinates of 3D points comprised of weight and density functions. With respect to a given point, the weight functions are learned with multi-layer perceptron networks and the density functions through kernel density estimation. A novel reformulation is proposed for efficiently computing the weight functions, which allowed us to dramatically scale up the network and significantly improve its performance. The learned convolution kernel can be used to compute translation-invariant and permutation-invariant convolution on any point set in the 3D space. Besides, PointConv can also be used as deconvolution operators to propagate features from a subsampled point cloud back to its original resolution. Experiments on ModelNet40, ShapeNet, and ScanNet show that deep convolutional neural networks built on PointConv are able to achieve state-of-the-art on challenging semantic segmentation benchmarks on 3D point clouds. Besides, our experiments converting CIFAR-10 into a point cloud showed that networks built on PointConv can match the performance of convolutional networks in 2D images of a similar structure.