Title: Building Rearticulable Models for Arbitrary 3D Objects from 4D Point Clouds
We build rearticulable models for arbitrary everyday man-made objects containing an arbitrary number of parts that are connected together in arbitrary ways via 1-degree-of-freedom joints. Given point cloud videos of such everyday objects, our method identifies the distinct object parts, which parts are connected to which other parts, and the properties of the joints connecting each part pair. We do this by jointly optimizing the part segmentation, transformation, and kinematics using a novel energy minimization framework. Our inferred animatable models enable retargeting to novel poses with sparse point-correspondence guidance. We test our method on a new articulating-robot dataset, the Sapiens dataset of common daily objects, and real-world scans. Experiments show that our method outperforms two leading prior works on various metrics.
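As a rough illustration of the kind of objective such a framework minimizes, here is a minimal numpy sketch for the simplest possible case: a two-part object joined by one revolute (1-DoF) joint. The hard part labels, known point correspondences, and grid search over the joint angle are simplifying assumptions made for this sketch, not the paper's actual energy terms or optimizer.

    # Toy energy: two parts connected by a single revolute (1-DoF) joint.
    # Assumptions (not from the paper): hard part labels, known point
    # correspondences across frames, squared reconstruction error as the energy.
    import numpy as np

    def rotation_about_axis(axis, angle):
        """Rodrigues' formula: rotation matrix for a (unit) axis and an angle."""
        a = axis / np.linalg.norm(axis)
        K = np.array([[0.0, -a[2], a[1]],
                      [a[2], 0.0, -a[0]],
                      [-a[1], a[0], 0.0]])
        return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

    def energy(points_t0, points_t1, labels, axis, pivot, angle):
        """Error when part 1 rotates by `angle` about (axis, pivot) and part 0 stays fixed."""
        pred = points_t0.copy()
        moving = labels == 1
        R = rotation_about_axis(axis, angle)
        pred[moving] = (points_t0[moving] - pivot) @ R.T + pivot
        return float(np.sum((pred - points_t1) ** 2))

    # Synthetic frames: half of the points sit on a 'door' that swings 30 degrees.
    rng = np.random.default_rng(0)
    pts0 = rng.uniform(-1.0, 1.0, size=(200, 3))
    labels = (pts0[:, 0] > 0).astype(int)
    axis, pivot = np.array([0.0, 0.0, 1.0]), np.zeros(3)
    R30 = rotation_about_axis(axis, np.deg2rad(30))
    pts1 = pts0.copy()
    pts1[labels == 1] = (pts0[labels == 1] - pivot) @ R30.T + pivot

    # One coordinate of the joint optimization: search the joint angle while the
    # segmentation and joint parameters are held fixed (the paper optimizes them all).
    angles = np.deg2rad(np.arange(0, 91))
    best = min(angles, key=lambda a: energy(pts0, pts1, labels, axis, pivot, a))
    print(f"recovered joint angle: {np.degrees(best):.1f} degrees")  # ~30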
Award ID(s):
2143873
NSF-PAR ID:
10418170
Journal Name:
Computer Vision and Pattern Recognition (CVPR)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Manipulating an articulated object requires perceiving its kinematic hierarchy: its parts, how each can move, and how those motions are coupled. Previous work has explored perception for kinematics, but none infers a complete kinematic hierarchy on never-before-seen object instances without relying on a schema or template. We present a novel perception system that achieves this goal. Our system infers the moving parts of an object and the kinematic couplings that relate them. To infer parts, it uses a point cloud instance segmentation neural network; to infer kinematic hierarchies, it uses a graph neural network that predicts the existence, direction, and type of the edges (i.e., joints) relating the inferred parts. We train these networks using simulated scans of synthetic 3D models. We evaluate our system on simulated scans of 3D objects, and we demonstrate a proof-of-concept use of our system to drive real-world robotic manipulation.
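     A minimal sketch of the edge-prediction idea described in item 1, assuming PyTorch and a hypothetical pairwise MLP over per-part features; it illustrates predicting the existence, direction, and type of joints between inferred parts and is not the authors' architecture.

        # Score every ordered pair of part features to predict whether a joint edge
        # exists and, if so, its type.  The label set and MLP are illustrative only.
        import torch
        import torch.nn as nn

        class PairwiseJointPredictor(nn.Module):
            JOINT_TYPES = ["none", "revolute", "prismatic"]  # hypothetical label set

            def __init__(self, feat_dim=128, hidden=256):
                super().__init__()
                self.mlp = nn.Sequential(
                    nn.Linear(2 * feat_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, len(self.JOINT_TYPES)),
                )

            def forward(self, part_feats):                      # part_feats: (P, feat_dim)
                P = part_feats.shape[0]
                src = part_feats.unsqueeze(1).expand(P, P, -1)  # parent candidate
                dst = part_feats.unsqueeze(0).expand(P, P, -1)  # child candidate
                pair = torch.cat([src, dst], dim=-1)            # ordered pair encodes direction
                return self.mlp(pair)                           # (P, P, num types) logits

        # Usage with dummy pooled per-part features for a 4-part object:
        feats = torch.randn(4, 128)
        logits = PairwiseJointPredictor()(feats)
        print(logits.shape)                                     # torch.Size([4, 4, 3])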
  2. Predicting the pose of objects from a single image is an important but difficult computer vision problem. Methods that predict a single point estimate do not predict the pose of objects with symmetries well and cannot represent uncertainty. Alternatively, some works predict a distribution over orientations in SO(3). However, training such models can be computation- and sample-inefficient. Instead, we propose a novel mapping of features from the image domain to the 3D rotation manifold. Our method then leverages SO(3) equivariant layers, which are more sample efficient, and outputs a distribution over rotations that can be sampled at arbitrary resolution. We demonstrate the effectiveness of our method at object orientation prediction, and achieve state-of-the-art performance on the popular PASCAL3D+ dataset. Moreover, we show that our method can model complex object symmetries, without any modifications to the parameters or loss function. Code is available at https://dmklee.github.io/image2sphere/ 
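     A minimal sketch of the idea in item 2 of representing a distribution over SO(3) by scoring a discrete grid of rotations and normalizing it, which can be evaluated at whatever grid resolution is desired. It uses scipy's Rotation for the grid and a hand-crafted score in place of the learned, image-conditioned features; it is unrelated to the released image2sphere code.

        # Unnormalized scores over a grid of candidate rotations -> softmax -> pmf.
        # A finer grid gives a higher-resolution approximation of the distribution.
        import numpy as np
        from scipy.spatial.transform import Rotation

        def rotation_grid(n):
            """Random rotations as a stand-in for an equivolumetric SO(3) grid."""
            return Rotation.random(n, random_state=0)

        def rotation_pmf(scores):
            """Softmax over per-rotation scores -> probability mass on the grid."""
            z = np.exp(scores - scores.max())
            return z / z.sum()

        # Toy 'network output': score candidates by closeness to a target rotation.
        grid = rotation_grid(5000)
        target = Rotation.from_euler("xyz", [30.0, 0.0, 0.0], degrees=True)
        geodesic = (grid * target.inv()).magnitude()   # angular distance to target
        pmf = rotation_pmf(-10.0 * geodesic)

        best = grid[np.argmax(pmf)]
        err = np.degrees((best * target.inv()).magnitude())
        print(f"mode of the pmf is {err:.1f} degrees from the target rotation")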
  3. We generalize the Guth–Katz joints theorem from lines to varieties. A special case says that N planes (2-flats) in 6 dimensions (over any field) have $O(N^{3/2})$ joints, where a joint is a point contained in a triple of these planes not all lying in some hyperplane. More generally, we prove the same bound when the set of N planes is replaced by a set of 2-dimensional algebraic varieties of total degree N, and a joint is a point that is regular for three varieties whose tangent planes at that point are not all contained in some hyperplane. Our most general result gives upper bounds, tight up to constant factors, for joints with multiplicities for several sets of varieties of arbitrary dimensions (known as Carbery’s conjecture). Our main innovation is a new way to extend the polynomial method to higher-dimensional objects, relating the degree of a polynomial and its orders of vanishing on a given set of points on a variety.
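     For reference, a compact LaTeX restatement of the special case quoted in item 3 (the notation is mine, not the paper's):

        % Special case of the generalized joints bound, restated from the abstract.
        Let $\Pi$ be a set of $N$ planes (2-flats) in $\mathbb{F}^6$ over an arbitrary
        field $\mathbb{F}$. Call a point $p$ a \emph{joint} of $\Pi$ if $p$ lies on three
        planes of $\Pi$ that do not all lie in a common hyperplane. Then the number of
        joints of $\Pi$ is $O\!\left(N^{3/2}\right)$.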
  4. We introduce an interactive system for extracting the geometries of generalized cylinders and cuboids from single- or multiple-view point clouds. Our proposed method is intuitive and only requires the object’s silhouettes to be traced by the user. Leveraging the user’s perceptual understanding of what an object looks like, our proposed method is capable of extracting accurate models, even in the presence of occlusion, clutter, or incomplete point cloud data, while preserving the original object’s details and scale. We demonstrate the merits of our proposed method through a set of experiments on a public RGB-D dataset. We extracted 16 objects from the dataset using at most two views of each object. Our extracted models exhibit a high degree of visual similarity to the original objects. Further, we achieved a mean normalized Hausdorff distance of 5.66% when comparing our extracted models with the dataset’s ground truths.
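     The "mean normalized Hausdorff distance" reported in item 4 can be read, under one plausible convention, as the symmetric Hausdorff distance divided by the ground truth's bounding-box diagonal; the sketch below uses scipy and that assumed normalization, which may differ from the paper's definition.

        # Symmetric Hausdorff distance between two point sets, normalized by the
        # ground truth's bounding-box diagonal so it can be reported as a percentage.
        import numpy as np
        from scipy.spatial.distance import directed_hausdorff

        def normalized_hausdorff(model_pts, gt_pts):
            d = max(directed_hausdorff(model_pts, gt_pts)[0],
                    directed_hausdorff(gt_pts, model_pts)[0])
            diag = np.linalg.norm(gt_pts.max(axis=0) - gt_pts.min(axis=0))
            return d / diag

        # Usage with dummy point clouds: a ground truth and a slightly perturbed model.
        rng = np.random.default_rng(1)
        gt = rng.uniform(0.0, 1.0, size=(1000, 3))
        model = gt + rng.normal(scale=0.01, size=gt.shape)
        print(f"{100.0 * normalized_hausdorff(model, gt):.2f}%")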