

Title: Online Manhattan Keyframe-based Dense Reconstruction from Indoor Depth Sequences
We present an online framework for dense 3D reconstruction of indoor scenes using sequential Manhattan keyframes. We take advantage of the global geometry of indoor scenes, extracted via Manhattan frames and pose optimization, to enhance the accuracy and robustness of the reconstructed models. During sequential reconstruction, a Manhattan frame is extracted for each keyframe after surface normal adjustment and used for Manhattan keyframe-based planar alignment to initialize surface registration, while pose graph optimization refines the camera poses. The final model is created by integrating the Manhattan keyframes into a unified volumetric model using the refined pose estimates. Experimental results demonstrate the advantage of our geometry-based approach in reducing cumulative registration error and overall geometric drift.
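The per-keyframe Manhattan-frame extraction described above can be illustrated with a minimal sketch. The alternating scheme below (assign each surface normal to its nearest frame axis up to sign, then re-fit the rotation with an SVD/Kabsch step) is a hypothetical illustration of the general idea, not the paper's exact algorithm:

```python
import numpy as np

def estimate_manhattan_frame(normals, iters=10):
    """Fit a Manhattan frame (rotation R) to unit surface normals.

    Hypothetical sketch: alternate (a) assigning each normal to its
    closest frame axis, up to sign, and (b) re-fitting the rotation
    with a Kabsch/SVD step so that R's columns track the three
    dominant orthogonal directions of the scene.
    """
    n = len(normals)
    R = np.eye(3)
    for _ in range(iters):
        coords = normals @ R                       # normals in frame coordinates
        axis = np.abs(coords).argmax(axis=1)       # nearest frame axis per normal
        sign = np.sign(coords[np.arange(n), axis])
        canon = np.zeros_like(normals)             # canonical signed-axis targets
        canon[np.arange(n), axis] = sign
        # Kabsch: rotation minimizing ||R @ canon_i - normal_i||.
        U, _, Vt = np.linalg.svd(canon.T @ normals)
        D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # keep det(R) = +1
        R = Vt.T @ D @ U.T
    return R
```

With clean input normals drawn from three orthogonal directions, a single iteration already recovers the frame exactly; the iterations matter once normals are noisy and axis assignments change.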
Award ID(s):
1910993
PAR ID:
10195285
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2019 IEEE International Conference on Visual Communications and Image Processing (VCIP)
Page Range / eLocation ID:
1 to 4
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Optical flow-based video interpolation is a commonly used method to enrich video details by generating new intermediate frames from existing ones. However, it cannot accurately reproduce the trajectories of irregular and fast-moving objects. Incorporating event information during the interpolation process is an effective way to achieve accurate reconstruction of high-speed scenes, but existing methods are not suitable for scenarios where events are sparse. To implement high-quality video interpolation for these scenarios, this study incorporates event information into the interpolation process with three ideas: a) treating the moving target as the foreground and using events to delineate the target area, b) matching the foreground to the background based on the event information to generate a temporal frame between keyframes, and c) generating intermediate frames between each keyframe and the temporal frame with optical flow interpolation. Empirical studies indicate that, owing to its efficient incorporation of event information, the proposed framework outperforms state-of-the-art methods in generating high-quality frames for video interpolation.
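Ideas (a) and (b) from the abstract above can be sketched in a few lines. The per-pixel event-count representation and the simple compositing rule are assumptions of this illustration, not the paper's actual method:

```python
import numpy as np

def compose_temporal_frame(key0, key1, events, thresh=1):
    """Hypothetical sketch of event-guided frame composition:
    accumulate events to delineate the moving foreground, then
    composite it over a background taken from the keyframes.

    `events` is assumed to be an (H, W) count of events per pixel
    between the two keyframes (an illustrative representation).
    """
    mask = (events >= thresh).astype(key0.dtype)  # foreground where events fired
    background = 0.5 * (key0 + key1)              # static content agrees across keyframes
    foreground = key1                             # take moving target from a keyframe
    return mask * foreground + (1.0 - mask) * background
```

A real system would warp the foreground along the event-derived trajectory rather than copy it; the sketch only shows how the event mask separates the two regions.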
  2.
    Dynamic 3D reconstruction of surgical cavities is essential in a wide range of computer-assisted surgical intervention applications, including but not limited to surgical guidance, pre-operative image registration, and vision-based force estimation. According to a survey on vision-based 3D reconstruction for abdominal minimally invasive surgery (MIS) [1], real-time 3D reconstruction and tissue deformation recovery remain open challenges to researchers. The main challenges include specular reflections from the wet tissue surface and the highly dynamic nature of abdominal surgical scenes. This work aims to overcome these obstacles by using multiple-viewpoint, independently moving RGB cameras to generate an accurate measurement of tissue deformation at the volume of interest (VOI), and proposes a novel, efficient camera pairing algorithm. Experimental results validate the proposed camera grouping and pair sequencing; they were evaluated with the Raven-II [2] surgical robot system for tool navigation, the Medtronic Stealth Station s7 surgical navigation system for real-time camera pose monitoring, and the Space Spider white light scanner to derive the ground-truth 3D model.
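The abstract above does not spell out its camera pairing criterion, but the general structure of such an algorithm can be sketched. The scoring rule here (reject pairs with divergent viewing directions, then greedily pick disjoint pairs by baseline) is purely a hypothetical illustration:

```python
import itertools
import numpy as np

def greedy_camera_pairs(centers, look_dirs, max_angle_deg=45.0):
    """Hypothetical greedy camera-pairing sketch: score each pair of
    cameras by how well both could stereo-match the same volume
    (viewing directions within a tolerance, usable baseline), then
    select disjoint pairs greedily by descending score."""
    n = len(centers)
    scored = []
    for i, j in itertools.combinations(range(n), 2):
        cosang = np.clip(np.dot(look_dirs[i], look_dirs[j]), -1.0, 1.0)
        if np.degrees(np.arccos(cosang)) > max_angle_deg:
            continue                            # views too divergent to match
        baseline = np.linalg.norm(centers[i] - centers[j])
        scored.append((baseline, i, j))         # prefer wider usable baseline
    used, pairs = set(), []
    for _, i, j in sorted(scored, reverse=True):
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs
```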
  3. Visual localization, the task of determining the position and orientation of a camera, typically involves three core components: offline construction of a keyframe database, efficient online keyframe retrieval, and robust local feature matching. However, significant challenges arise when there are large viewpoint disparities between the query view and the database, such as attempting localization in a corridor whose database was previously built from the opposing direction. Intuitively, this issue can be addressed by synthesizing a set of virtual keyframes that cover all viewpoints. However, existing methods for synthesizing novel views to assist localization often fail to ensure geometric accuracy under large viewpoint changes. In this paper, we introduce a confidence-aware geometric prior into 2D Gaussian splatting to ensure the geometric accuracy of the scene. We can then render novel views through the mesh with clear structure and accurate geometry, even under significant viewpoint changes, enabling the synthesis of a comprehensive set of virtual keyframes. Incorporating this geometry-preserving virtual keyframe database into the localization pipeline significantly enhances the robustness of visual localization.
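The online retrieval step in the pipeline above can be illustrated with a minimal pose-distance ranking over a database of real and virtual keyframes. The (R, t) pose representation and the weighted rotation-plus-translation cost are assumptions of this sketch, not the paper's retrieval method (which operates on images, not known query poses):

```python
import numpy as np

def retrieve_keyframe(query_R, query_t, db, w_rot=1.0, w_trans=1.0):
    """Hypothetical retrieval sketch: rank keyframes (real or virtual)
    by a weighted pose distance combining the geodesic rotation angle
    and the camera-center distance.  `db` is a list of (R, t) poses,
    an assumed representation for illustration."""
    best, best_cost = None, np.inf
    for idx, (R, t) in enumerate(db):
        # Geodesic rotation distance: angle of the relative rotation.
        cos = np.clip((np.trace(query_R.T @ R) - 1.0) / 2.0, -1.0, 1.0)
        cost = w_rot * np.arccos(cos) + w_trans * np.linalg.norm(query_t - t)
        if cost < best_cost:
            best, best_cost = idx, cost
    return best
```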
  4. Umetani, N.; Wojtan, C.; Vouga, E. (Ed.)
    Most non-photorealistic rendering (NPR) methods for line drawing synthesis operate on a static shape. They are not tailored to process animated 3D models due to the extensive per-frame parameter tuning needed to achieve the intended look and natural transitions. This paper introduces a framework for interactive line drawing synthesis from animated 3D models based on a learned style space for drawing representation and interpolation. We refer to style as the relationship between stroke placement in a line drawing and its corresponding geometric properties. Starting from a given sequence of an animated 3D character, a user creates drawings for a set of keyframes. Our system embeds the raster drawings into a latent style space after they are disentangled from the underlying geometry. By traversing the latent space, our system enables a smooth transition between the input keyframes. The user may also edit, add, or remove keyframes interactively, similar to a typical keyframe-based workflow. We implement our system with deep neural networks trained on synthetic line drawings produced by a combination of NPR methods. Our drawing-specific supervision and optimization-based embedding mechanism allow generalization from NPR line drawings to user-created drawings at run time. Experiments show that our approach generates high-quality line drawing animations while allowing interactive control of the drawing style across frames.
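The "traversing the latent space" step above can be sketched as interpolation between the latent codes of the two drawn keyframes surrounding a query time. Plain linear interpolation is an assumption of this sketch; the paper's learned embedding may warrant a different traversal:

```python
import numpy as np

def interpolate_style(key_times, key_latents, t):
    """Hypothetical sketch of style-space traversal: linearly
    interpolate between the latent codes of the two keyframes that
    bracket time `t` to obtain a style code for an in-between frame."""
    key_times = np.asarray(key_times, dtype=float)
    key_latents = np.asarray(key_latents, dtype=float)
    i = np.searchsorted(key_times, t, side="right") - 1
    i = np.clip(i, 0, len(key_times) - 2)         # clamp to valid segment
    t0, t1 = key_times[i], key_times[i + 1]
    a = (t - t0) / (t1 - t0)                      # position within segment
    return (1.0 - a) * key_latents[i] + a * key_latents[i + 1]
```

Editing, adding, or removing a keyframe in this scheme just modifies the `key_times`/`key_latents` arrays, matching the keyframe-based workflow the abstract describes.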
  5. Face registration is a major and critical step for face analysis. Existing facial activity recognition systems often employ coarse face alignment based on a few fiducial points, such as the eyes, and extract features from an equal-sized grid. Such extracted features are susceptible to variations in face pose, facial deformation, and person-specific geometry. In this work, we propose a novel face registration method named facial grid transformation to improve feature extraction for recognizing facial Action Units (AUs). Based on the transformed grid, novel grid edge features are developed to capture local facial motions related to AUs. Extensive experiments on two well-known AU-coded databases have demonstrated that the proposed method yields significant improvements over methods based on equal-sized grids on both posed and, more importantly, spontaneous facial displays. Furthermore, the proposed method also outperforms state-of-the-art methods using either coarse alignment or mesh-based face registration.
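One plausible reading of the "grid edge features" above is a feature vector built from the edges of the deformed facial grid, whose lengths change with local facial motion. The sketch below is a hypothetical illustration under that reading, not the paper's defined feature:

```python
import numpy as np

def grid_edge_features(grid_pts, rows, cols):
    """Hypothetical sketch of grid edge features: given the 2D vertex
    positions of a rows x cols facial grid after registration or
    transformation, emit the length of every horizontal and vertical
    grid edge.  Local facial motion (e.g. around an AU region)
    stretches or shrinks nearby edges, changing the feature vector."""
    P = np.asarray(grid_pts, dtype=float).reshape(rows, cols, 2)
    horiz = np.linalg.norm(P[:, 1:] - P[:, :-1], axis=-1)  # (rows, cols-1)
    vert = np.linalg.norm(P[1:, :] - P[:-1, :], axis=-1)   # (rows-1, cols)
    return np.concatenate([horiz.ravel(), vert.ravel()])
```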