Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
The ability to estimate the 3D human shape and pose from images can be useful in many contexts. Recent approaches have explored using graph convolutional networks and achieved promising results. The fact that the 3D shape is represented by a mesh, an undirected graph, makes graph convolutional networks a natural fit for this problem. However, graph convolutional networks have limited representation power Information from nodes in the graph is passed to connected neighbors, and propagation of information requires successive graph convolutions. To overcome this limitation, we propose a dual-scale graph approach. We use a coarse graph, derived from a dense graph, to estimate the human’s 3D pose, and the dense graph to estimate the 3D shape. Information in coarse graphs can be propagated over longer distances compared to dense graphs. In addition, information about pose can guide to recover local shape detail and vice versa. We recognize that the connection between coarse and dense is itself a graph, and introduce graph fusion blocks to exchange information between graphs with different scales. We train our model end-to-end and show that we can achieve state-of-the-art results for several evaluation datasets. The code is available at the following link, https://github.com/yuxwind/BiGraphBody.
Recovering rigid registration between successive camera poses lies at the heart of 3D reconstruction, SLAM and visual odometry. Registration relies on the ability to compute discriminative 2D features in successive camera images for determining feature correspondences, which is very challenging in feature-poor environments, i.e. low-texture and/or low-light environments. In this paper, we aim to address the challenge of recovering rigid registration between successive camera poses in feature-poor environments in a Visual Inertial Odometry (VIO) setting. In addition to inertial sensing, we instrument a small aerial robot with an RGBD camera and propose a framework that unifies the incorporation of 3D geometric entities: points, lines, and planes. The tracked 3D geometric entities provide constraints in an Extended Kalman Filtering framework. We show that by directly exploiting 3D geometric entities, we can achieve improved registration. We demonstrate our approach on different texture-poor environments, with some containing only flat texture-less surfaces providing essentially no 2D features for tracking. In addition, we evaluate how the addition of different 3D geometric entities contributes to improved pose estimation by comparing an estimated pose trajectory to a ground truth pose trajectory obtained from a motion capture system. We consider computationally efficient methods for detecting 3D points, linesmore »