Title: Mapping of Sparse 3D Data Using Alternating Projection
We propose a novel technique to register sparse 3D scans in the absence of texture. While existing methods such as KinectFusion or Iterative Closest Points (ICP) heavily rely on dense point clouds, this task is particularly challenging under sparse conditions without RGB data. Sparse texture-less data does not come with high-quality boundary signal, and this prohibits the use of correspondences from corners, junctions, or boundary lines. Moreover, in the case of sparse data, it is incorrect to assume that the same point will be captured in two consecutive scans. We take a different approach and first re-parameterize the point-cloud using a large number of line segments. In this re-parameterized data, there exists a large number of line intersection (and not correspondence) constraints that allow us to solve the registration task. We propose the use of a two-step alternating projection algorithm by formulating the registration as the simultaneous satisfaction of intersection and rigidity constraints. The proposed approach outperforms other top-scoring algorithms on both Kinect and LiDAR datasets. In Kinect, we can use 100X downsampled sparse data and still outperform competing methods operating on full-resolution data.
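To make the idea of alternating between two constraint sets concrete, here is a minimal sketch of a generic alternating projection loop. It is a toy illustration and not the authors' algorithm: it assumes each source point is already assigned to a known line (the paper works with line intersections and does not assume correspondences), it uses the standard Kabsch/SVD fit as the rigidity projection, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def project_onto_lines(points, line_origins, line_dirs):
    """Toy 'intersection' projection: snap each point to the closest
    point on its assigned line (line_dirs must be unit vectors)."""
    offsets = points - line_origins
    s = np.sum(offsets * line_dirs, axis=1, keepdims=True)
    return line_origins + s * line_dirs

def project_onto_rigid(source, target):
    """Rigidity projection: least-squares rigid motion (Kabsch) that
    maps source to target, returned as (R, t) with det(R) = +1."""
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t

def alternating_projection(source, line_origins, line_dirs, iters=50):
    """Alternate between the two projections, starting from the identity."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = source @ R.T + t
        on_lines = project_onto_lines(moved, line_origins, line_dirs)
        R, t = project_onto_rigid(source, on_lines)
    return R, t

def residual(source, R, t, line_origins, line_dirs):
    """Norm of the point-to-line distances after applying (R, t)."""
    moved = source @ R.T + t
    snapped = project_onto_lines(moved, line_origins, line_dirs)
    return float(np.linalg.norm(moved - snapped))

# Synthetic toy data (assumption): 40 random lines, one point per line,
# observed in a frame offset by a small known rigid motion.
rng = np.random.default_rng(0)
origins = rng.uniform(-1.0, 1.0, size=(40, 3))
dirs = rng.normal(size=(40, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
points_ref = origins + rng.uniform(-1.0, 1.0, size=(40, 1)) * dirs

angle = 0.1  # radians, rotation about the z-axis
c, s = np.cos(angle), np.sin(angle)
true_R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
true_t = np.array([0.05, -0.02, 0.03])
source = (points_ref - true_t) @ true_R  # so that source @ true_R.T + true_t == points_ref

R_est, t_est = alternating_projection(source, origins, dirs, iters=100)
print("residual before:", residual(source, np.eye(3), np.zeros(3), origins, dirs))
print("residual after: ", residual(source, R_est, t_est, origins, dirs))
```

Each step minimizes the same squared-distance objective over one set of variables while the other is held fixed, so the residual cannot increase from one iteration to the next. How quickly it shrinks, and whether it reaches the true pose, depends on the initial misalignment and the line geometry.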
Award ID(s):
1764071
NSF-PAR ID:
10296127
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Ishikawa, H.; Liu, CL.; Pajdla, T.; Shi, J.
Date Published:
Journal Name:
Lecture notes in computer science
Volume:
12622
ISSN:
0302-9743
Page Range / eLocation ID:
295 - 313
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. 3D scan registration is a classical yet highly useful problem in the context of 3D sensors such as Kinect and Velodyne. While there are several existing methods, the techniques are usually incremental: adjacent scans are registered first to obtain the initial poses, followed by motion averaging and bundle-adjustment refinement. In this paper, we take a different approach and develop minimal solvers for jointly computing the initial poses of cameras in small loops such as 3-, 4-, and 5-cycles. Note that the classical registration of 2 scans can be done using a minimum of 3 point matches to compute 6 degrees of relative motion. On the other hand, to jointly compute the 3D registrations in n-cycles, we take 2 point matches between the first n−1 consecutive pairs (i.e., Scan 1 & Scan 2, ..., and Scan n−1 & Scan n) and 1 or 2 point matches between Scan 1 and Scan n. Overall, we use 5, 7, and 10 point matches for 3-, 4-, and 5-cycles, and recover 12, 18, and 24 degrees of transformation variables, respectively. Using simulations and real data, we show that 3D registration using mini n-cycles is computationally efficient and can provide alternate and better initial poses compared to standard pairwise methods.
  2. This paper addresses the problem of learning to complete a scene's depth from sparse depth points and images of indoor scenes. Specifically, we study the case in which the sparse depth is computed from a visual-inertial simultaneous localization and mapping (VI-SLAM) system. The resulting point cloud has low density, is noisy, and has a nonuniform spatial distribution, as compared to the input from active depth sensors, e.g., LiDAR or Kinect. Since the VI-SLAM produces point clouds only over textured areas, we compensate for the missing depth of the low-texture surfaces by leveraging their planar structures and their surface normals, which are an important intermediate representation. The pre-trained surface normal network, however, suffers from large performance degradation when there is a significant difference in the viewing direction (especially the roll angle) of the test image as compared to the trained ones. To address this limitation, we use the available gravity estimate from the VI-SLAM to warp the input image to the orientation prevailing in the training dataset. This results in a significant performance gain for the surface normal estimate, and thus the dense depth estimates. Finally, we show that our method outperforms other state-of-the-art approaches both on training (ScanNet [1] and NYUv2 [2]) and testing (collected with Azure Kinect [3]) datasets.
  3. Background and Aims: Terrestrial laser scanners (TLSs) have successfully captured various properties of individual trees and have potential to further increase the quality and efficiency of forest surveys. However, TLSs are limited to line of sight observations, and forests are complex structural environments that can occlude TLS beams and thereby cause incomplete TLS samples. We evaluate the prevalence and sources of occlusion that limit line of sight to forest stems for TLS scans, assess the impacts of TLS sample incompleteness, and evaluate sampling strategies and data analysis techniques aimed at improving sample quality and representativeness. Methods: We use a large number of TLS scans (761), taken across a 255 650 m² area of forest with detailed field survey data: the Harvard Forest Global Earth Observatory (ForestGEO) (MA, USA). Sets of TLS returns are matched to stem positions in the field surveys to derive TLS-observed stem sets, which are compared with two additional stem sets derived solely from the field survey data: a set of stems within a fixed range from the TLS and a set of stems based on 2-D modelling of line of sight. Stem counts and densities are compared between the stem sets, and four alternative derivations of area to correct stem densities for the effects of occlusion are evaluated. Representation of diameter at breast height and species, drawn from the field survey data, are also compared between the stem sets. Key Results: Occlusion from non-stem sources was the major influence on TLS line of sight. Transect and point TLS samples demonstrated better representativeness of some stem properties than did plots. Deriving sampled area from TLS scans improved estimates of stem density. Conclusions: TLS sampling efforts should consider alternative sampling strategies and move towards in-progress assessment of sample quality and dynamic adaptation of sampling.
  4. In recent years, LiDAR sensors have become pervasive in the solutions to localization tasks for autonomous systems. One key step in using LiDAR data for localization is the alignment of two LiDAR scans taken from different poses, a process called scan-matching or point cloud registration. Most existing algorithms for this problem are heuristic in nature and local, meaning they may not produce accurate results under poor initialization. Moreover, existing methods give no guarantee on the quality of their output, which can be detrimental for safety-critical tasks. In this paper, we analyze a simple algorithm for point cloud registration, termed PASTA. This algorithm is global and does not rely on point-to-point correspondences, which are typically absent in LiDAR data. Moreover, and to the best of our knowledge, we offer the first point cloud registration algorithm with provable error bounds. Finally, we illustrate the proposed algorithm and error bounds in simulation on a simple trajectory tracking task. 
  5. Recovering rigid registration between successive camera poses lies at the heart of 3D reconstruction, SLAM and visual odometry. Registration relies on the ability to compute discriminative 2D features in successive camera images for determining feature correspondences, which is very challenging in feature-poor environments, i.e. low-texture and/or low-light environments. In this paper, we aim to address the challenge of recovering rigid registration between successive camera poses in feature-poor environments in a Visual Inertial Odometry (VIO) setting. In addition to inertial sensing, we instrument a small aerial robot with an RGBD camera and propose a framework that unifies the incorporation of 3D geometric entities: points, lines, and planes. The tracked 3D geometric entities provide constraints in an Extended Kalman Filtering framework. We show that by directly exploiting 3D geometric entities, we can achieve improved registration. We demonstrate our approach on different texture-poor environments, with some containing only flat texture-less surfaces providing essentially no 2D features for tracking. In addition, we evaluate how the addition of different 3D geometric entities contributes to improved pose estimation by comparing an estimated pose trajectory to a ground truth pose trajectory obtained from a motion capture system. We consider computationally efficient methods for detecting 3D points, lines and planes, since our goal is to implement our approach on small mobile robots, such as drones. 