Title: A Framework for Analyzing the Whole Body Surface Area from a Single View
We present a virtual reality (VR) framework for the analysis of whole human body surface area. Usual methods for determining the whole body surface area (WBSA) are based on well known formulae, characterized by large errors when the subject is obese, or belongs to certain subgroups. For these situations, we believe that a computer vision approach can overcome these problems and provide a better estimate of this important body indicator. Unfortunately, using machine learning techniques to design a computer vision system able to provide a new body indicator that goes beyond the use of only body weight and height, entails a long and expensive data acquisition process. A more viable solution is to use a dataset composed of virtual subjects. Generating a virtual dataset allowed us to build a population with different characteristics (obese, underweight, age, gender). However, synthetic data might differ from a real scenario, typical of the physician’s clinic. For this reason we develop a new virtual environment to facilitate the analysis of human subjects in 3D. This framework can simulate the acquisition process of a real camera, making it easy to analyze and to create training data for machine learning algorithms. With this virtual environment, we can easily simulate the real setup of a clinic, where a subject is standing in front of a camera, or may assume a different pose with respect to the camera. We use this newly designated environment to analyze the whole body surface area (WBSA). In particular, we show that we can obtain accurate WBSA estimations with just one view, virtually enabling the possibility to use inexpensive depth sensors (e.g., the Kinect) for large scale quantification of the WBSA from a single view 3D map.
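For context, the height-and-weight formulae that the abstract contrasts with a vision-based estimate can be sketched as follows. The abstract does not name the formulae it has in mind, so the choice of the Du Bois and Mosteller formulas here is an assumption, as are the function names; this is an illustration of the baseline approach, not the authors' method.

```python
import math


def bsa_du_bois(weight_kg: float, height_cm: float) -> float:
    """Body surface area in m^2 using the Du Bois formula.

    Assumption: chosen for illustration; the abstract only says
    "well known formulae" without naming one.
    """
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def bsa_mosteller(weight_kg: float, height_cm: float) -> float:
    """Body surface area in m^2 using the Mosteller formula."""
    return math.sqrt(weight_kg * height_cm / 3600.0)


if __name__ == "__main__":
    # Hypothetical subjects: same height, different weights.
    for weight_kg, height_cm in [(70.0, 175.0), (120.0, 175.0)]:
        du_bois = bsa_du_bois(weight_kg, height_cm)
        mosteller = bsa_mosteller(weight_kg, height_cm)
        print(f"{weight_kg:.0f} kg, {height_cm:.0f} cm: "
              f"Du Bois {du_bois:.3f} m^2, Mosteller {mosteller:.3f} m^2")
```

Both functions take only weight and height as input, so two subjects with the same weight and height but different body shapes receive the same estimate. That is the limitation a depth-based measurement is meant to address.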
Award ID(s):
1650474 1066197
PAR ID:
10053838
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
PloS one
Volume:
12
Issue:
1
ISSN:
1932-6203
Page Range / eLocation ID:
e0166749
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The Glimpse.3D is a body-worn camera that captures, processes, stores, and transmits 3D visual information of a real-world environment using a low-cost camera-based sensor system that is constrained by its limited processing capability, storage, and battery life. The 3D content is viewed on a mobile device such as a smartphone or a virtual reality headset. This system can be used in applications such as capturing and sharing 3D content on social media, training people in different professions, and post-facto analysis of an event. Glimpse.3D uses off-the-shelf hardware and standard computer vision algorithms. Its novelty lies in the ability to optimally control camera data acquisition and processing stages to guarantee the desired quality of captured information and battery life. The design of the controller is based on extensive measurements and modeling of the relationships between the linear and angular motion of a body-worn camera and the quality of generated 3D point clouds as well as the battery life of the system. To achieve this, we 1) devise a new metric to quantify the quality of generated 3D point clouds, 2) formulate an optimization problem to find an optimal trigger point for the camera system that prolongs its battery life while maximizing the quality of the captured 3D environment, and 3) make the model adaptive so that the system evolves and its performance improves over time.
  2. Full surround 3D imaging for shape acquisition is essential for generating digital replicas of real-world objects. Surrounding an object we seek to scan with a kaleidoscope, that is, a configuration of multiple planar mirrors, produces an image of the object that encodes information from a combinatorially large number of virtual viewpoints. This information is practically useful for the full surround 3D reconstruction of the object, but cannot be used directly, as we do not know which virtual viewpoint each image pixel corresponds to, that is, the pixel label. We introduce a structured light system that combines a projector and a camera with a kaleidoscope. We then prove that we can accurately determine the labels of projector and camera pixels, for arbitrary kaleidoscope configurations, using the projector-camera epipolar geometry. We use this result to show that our system can serve as a multi-view structured light system with hundreds of virtual projectors and cameras. This makes our system capable of scanning complex shapes precisely and with full coverage. We demonstrate the advantages of the kaleidoscopic structured light system by scanning objects that exhibit a large range of shapes and reflectances.
  3. This paper introduces MultiMesh, a multi-subject 3D human mesh construction system based on commodity WiFi. Our system can reuse commodity WiFi devices in the environment and, unlike traditional computer vision-based approaches, is capable of working in non-line-of-sight (NLoS) conditions. Specifically, we leverage an L-shaped antenna array to generate the two-dimensional angle of arrival (2D AoA) of reflected signals for subject separation in the physical space. We further leverage the angle of departure and time of flight of the signal to enhance the resolvability for precise separation of close subjects. Then we exploit information from various signal dimensions to mitigate the interference of indirect reflections according to different signal propagation paths. Moreover, we employ the continuity of human movement in the spatial-temporal domain to track weak reflected signals of faraway subjects. Finally, we utilize a deep learning model to digitize 2D AoA images of each subject into the 3D human mesh. We conducted extensive experiments in real-world multi-subject scenarios under various environments to evaluate the performance of our system. For example, we conducted experiments with occlusion and performed human mesh construction for different distances between two subjects and different distances between subjects and WiFi devices. The results show that MultiMesh can accurately construct 3D human meshes for multiple users with an average vertex error of 4 cm. The evaluations also demonstrate that our system could achieve comparable performance for unseen environments and people. Moreover, we also evaluate the accuracy of spatial information extraction and the performance of subject detection. These evaluations demonstrate the robustness and effectiveness of our system.
  4. How to effectively represent camera pose is an essential problem in 3D computer vision, especially in tasks such as camera pose regression and novel view synthesis. Traditionally, the 3D position of the camera is represented by Cartesian coordinates and the orientation is represented by Euler angles or quaternions. These representations are manually designed and may not be the most effective for downstream tasks. In this work, we propose an approach to learn neural representations of camera poses and 3D scenes, coupled with neural representations of local camera movements. Specifically, the camera pose and 3D scene are represented as vectors and the local camera movement is represented as a matrix operating on the vector of the camera pose. We demonstrate that the camera movement can further be parametrized by a matrix Lie algebra that underlies a rotation system in the neural space. The vector representations are then concatenated and generate the posed 2D image through a decoder network. The model is learned from only posed 2D images and corresponding camera poses, without access to depths or shapes. We conduct extensive experiments on synthetic and real datasets. The results show that compared with other camera pose representations, our learned representation is more robust to noise in novel view synthesis and more effective in camera pose regression.
  5. Tan, Jie; Toussaint, Marc; Darvish, Kourosh (Ed.)
    Contacts play a critical role in most manipulation tasks. Robots today mainly use proximal touch/force sensors to sense contacts, but the information they provide must be calibrated and is inherently local, with practical applications relying either on extensive surface coverage or restrictive assumptions to resolve ambiguities. We propose a vision-based extrinsic contact localization task: with only a single RGB-D camera view of a robot workspace, identify when and where an object held by the robot contacts the rest of the environment. We show that careful task-attuned design is critical for a neural network trained in simulation to discover solutions that transfer well to a real robot. Our final approach im2contact demonstrates the promise of versatile general-purpose contact perception from vision alone, performing well for localizing various contact types (point, line, or planar; sticking, sliding, or rolling; single or multiple), and even under occlusions in its camera view. Video results can be found at: https://sites.google.com/view/im2contact/home 