Neural Groundplans: Persistent Neural Scene Representations from a Single Image

Sharma, Prafull; Tewari, Ayush; Du, Yilun; Zakharov, Sergey; Ambrus, Rares Andrei; Gaidon, Adrien; Freeman, William T.; Durand, Fredo; Tenenbaum, Joshua B.; Sitzmann, Vincent

Citation Details

We present a method to map 2D image observations of a scene to a persistent 3D scene representation, enabling novel view synthesis and disentangled representation of the movable and immovable components of the scene. Motivated by the bird’s-eye-view (BEV) representation commonly used in vision and robotics, we propose conditional neural groundplans, ground-aligned 2D feature grids, as persistent and memory-efficient scene representations. Our method is trained self-supervised from unlabeled multi-view observations using differentiable rendering, and learns to complete geometry and appearance of occluded regions. In addition, we show that we can leverage multi-view videos at training time to learn to separately reconstruct static and movable components of the scene from a single image at test time. The ability to separately reconstruct movable objects enables a variety of downstream tasks using simple heuristics, such as extraction of object-centric 3D representations, novel view synthesis, instance-level segmentation, 3D bounding box prediction, and scene editing. This highlights the value of neural groundplans as a backbone for efficient 3D scene understanding models. more »

Award ID(s):: 2211260

PAR ID:: 10437394

Author(s) / Creator(s):: Sharma, Prafull; Tewari, Ayush; Du, Yilun; Zakharov, Sergey; Ambrus, Rares Andrei; Gaidon, Adrien; Freeman, William T.; Durand, Fredo; Tenenbaum, Joshua B.; Sitzmann, Vincent

Date Published:: 2023-02-01

Journal Name:: International Conference on Learning Representations

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
The DOI is not currently available.

More Like this