Single-image 3D face reconstruction with accurate geometric details is a critical and challenging task, owing to the locally similar appearance of the facial surface and the fine details of facial organs. In this work, we introduce a self-supervised 3D face reconstruction approach that recovers detailed textures from a single image under different camera settings. The proposed network learns high-quality disparity maps from stereo face images during training, while only a single face image is required to generate the 3D model at inference time. To recover the fine details of each organ and of the facial surface, the framework introduces a facial-landmark spatial-consistency constraint to guide learning at the local point level, and a segmentation scheme over facial organs to constrain correspondences at the organ level. The face shape and textures are further refined with holistic constraints derived from varying illumination and shading information. Compared with state-of-the-art 3DMM and geometry-based single-image reconstruction algorithms, the proposed framework recovers more accurate 3D facial details, both quantitatively and qualitatively.
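As a rough illustration of the multi-level constraints described in this abstract, the sketch below combines a point-level landmark-consistency term with an organ-level segmentation-overlap term and a holistic photometric term. The tensor shapes, loss weights, and function names are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of a multi-level training objective; weights
# and shapes are illustrative assumptions, not taken from the paper.
import numpy as np

def landmark_loss(pred_lmk, det_lmk):
    """Point level: mean Euclidean distance between landmarks projected
    from the predicted 3D face and detected 2D landmarks.
    Both arrays have shape (num_landmarks, 2)."""
    return np.linalg.norm(pred_lmk - det_lmk, axis=1).mean()

def organ_loss(pred_mask, det_mask):
    """Organ level: one minus the soft IoU between the rendered and
    detected segmentation masks of one facial organ (values in [0, 1])."""
    inter = (pred_mask * det_mask).sum()
    union = (pred_mask + det_mask - pred_mask * det_mask).sum()
    return 1.0 - inter / (union + 1e-8)

def total_loss(pred_lmk, det_lmk, organ_masks, photometric_term,
               w_lmk=1.0, w_org=0.5, w_photo=0.1):
    """Combine point-, organ-, and holistic-level terms; organ_masks is
    a list of (predicted, detected) mask pairs, and the holistic
    shading/illumination term is passed in precomputed."""
    org = np.mean([organ_loss(p, d) for p, d in organ_masks])
    return (w_lmk * landmark_loss(pred_lmk, det_lmk)
            + w_org * org + w_photo * photometric_term)
```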
BioFace-3D: Continuous 3D Facial Reconstruction through Lightweight Single-Ear Biosensors
Over the last decade, facial landmark tracking and 3D reconstruction have gained considerable attention due to their numerous applications, such as human-computer interaction, facial expression analysis, and emotion recognition. Traditional approaches require users to be confined to a particular location and to face a camera under constrained recording conditions (e.g., without occlusions and under good lighting). This highly restricted setting prevents them from being deployed in many application scenarios involving human motion. In this paper, we propose the first single-earpiece lightweight biosensing system, BioFace-3D, that can unobtrusively, continuously, and reliably sense entire facial movements, track 2D facial landmarks, and further render 3D facial animations. Our single-earpiece biosensing system uses a cross-modal transfer learning model to transfer the knowledge embodied in a high-grade visual facial landmark detection model to the low-grade biosignal domain. After training, BioFace-3D can perform continuous 3D facial reconstruction directly from biosignals, without any visual input. By removing the need for a camera positioned in front of the user, this paradigm shift from visual sensing to biosensing opens new opportunities in many emerging mobile and IoT applications. Extensive experiments involving 16 participants under various settings demonstrate that BioFace-3D can accurately track 53 major facial landmarks with only 1.85 mm average error and 3.38% normalized mean error, which is comparable with most state-of-the-art camera-based solutions. The rendered 3D facial animations, which are consistent with the real facial movements, further validate the system's capability for continuous 3D facial reconstruction.
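For reference, the 3.38% figure is a normalized mean error (NME), a standard landmark-tracking metric. Below is a minimal sketch of how such a metric is typically computed; the choice of normalizing distance (commonly the inter-ocular distance for facial landmarks) and the toy values are assumptions, not the paper's protocol.

```python
import numpy as np

def normalized_mean_error(pred, gt, norm_dist):
    """Mean per-landmark Euclidean error divided by a normalizing
    distance. pred and gt have shape (num_landmarks, 2)."""
    return np.linalg.norm(pred - gt, axis=1).mean() / norm_dist

# Toy usage: 53 landmarks, with an assumed inter-ocular distance.
rng = np.random.default_rng(0)
gt = rng.random((53, 2)) * 100
pred = gt + rng.normal(size=(53, 2))
print(normalized_mean_error(pred, gt, norm_dist=40.0))
```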
- Award ID(s): 2132112
- PAR ID: 10377880
- Date Published:
- Journal Name: Proceedings of the 27th Annual International Conference on Mobile Computing and Networking (MobiCom '21)
- Page Range / eLocation ID: 350 to 363
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Low-cost 3D scanners and automatic photogrammetry software have brought the digitization of objects into 3D models to the consumer level. However, existing digitization techniques are either tedious, disruptive to the scanned object, or expensive. We create a novel 3D scanning system using consumer-grade hardware that revolves a camera around the object of interest. Our approach does not disturb the object during capture, allowing us to scan delicate objects that can deform under motion, such as potted plants. Our system consists of a Raspberry Pi camera and computer, a stepper motor, a 3D-printed camera track, and control software. The scanner lets the user gather image sets for 3D model reconstruction with photogrammetry software with minimal effort. We scale 3D scanning to objects of varying sizes by designing the scanner with programmatic modeling and by allowing the user to change the physical dimensions of the scanner without redrawing each part.
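A minimal sketch of such a revolve-and-capture loop is shown below, using the standard picamera and RPi.GPIO libraries. The pin number, step timing, motor resolution, and shot count are illustrative assumptions, not the authors' configuration.

```python
# Sketch of a revolve-and-capture loop for a camera-on-track scanner;
# pin, timing, and counts are assumptions for illustration.
import time
from picamera import PiCamera
import RPi.GPIO as GPIO

STEP_PIN = 17            # assumed GPIO pin wired to the stepper driver
STEPS_PER_REV = 200      # assumed 1.8-degree stepper motor
SHOTS_PER_REV = 40       # assumed number of photos per revolution

GPIO.setmode(GPIO.BCM)
GPIO.setup(STEP_PIN, GPIO.OUT)

def step(n):
    """Pulse the stepper driver n times to advance the camera."""
    for _ in range(n):
        GPIO.output(STEP_PIN, GPIO.HIGH)
        time.sleep(0.002)
        GPIO.output(STEP_PIN, GPIO.LOW)
        time.sleep(0.002)

def scan(out_dir="scan"):
    """Capture one full revolution of images for photogrammetry."""
    camera = PiCamera()
    time.sleep(2)  # let the sensor settle before the first shot
    for i in range(SHOTS_PER_REV):
        camera.capture(f"{out_dir}/img_{i:03d}.jpg")
        step(STEPS_PER_REV // SHOTS_PER_REV)
    camera.close()
```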
Geometric morphometrics is used in the biological sciences to quantify morphological traits. However, the need for manual landmark placement hampers scalability: it is time-consuming, labor-intensive, and open to human error. The selected landmarks embody a specific hypothesis regarding the critical geometry relevant to the biological question, and any adjustment to this hypothesis necessitates acquiring a new set of landmarks or revising them significantly, which can be impractical for large datasets. There is a pressing need for more efficient and flexible methods of landmark placement that can adapt to different hypotheses without requiring extensive human effort. This study investigates the precision and accuracy of landmarks derived from functional correspondences obtained through the functional map framework of geometry processing. We utilize a deep functional map network to learn shape descriptors, which enable us to establish functional-map-based and point-to-point correspondences between specimens in our dataset. Our methodology automates the landmarking process by interrogating these maps to identify corresponding landmarks, using manually placed landmarks from the entire dataset as a reference. We apply our method to a dataset of rodent mandibles and compare its performance to that of MALPACA, a standard tool for automatic landmark placement. Our model demonstrates a speed improvement over MALPACA while maintaining a competitive level of accuracy. Although MALPACA typically shows the lowest RMSE, our models perform comparably well, particularly with smaller training datasets, indicating strong generalizability. Visual assessments confirm the precision of our automated landmark placements, with deviations consistently falling within an acceptable range of the MALPACA estimates. Our results underscore the potential of unsupervised learning models in anatomical landmark placement, presenting a practical and efficient alternative to traditional methods. Our approach saves significant time and effort and provides the flexibility to adapt to different hypotheses about critical geometric features without manual re-acquisition of landmarks. This advancement can significantly enhance the scalability and applicability of geometric morphometrics, making it more feasible for large datasets and diverse biological studies.
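To make the landmark-transfer step concrete, the sketch below shows how a point-to-point correspondence recovered from a functional map can carry manually placed reference landmarks onto a new specimen. The map convention (reference vertex to target vertex) and the consensus averaging over several references are assumptions for illustration, not the study's exact procedure.

```python
import numpy as np

def transfer_landmarks(p2p, ref_landmark_idx, target_verts):
    """p2p[i] is the target-mesh vertex index corresponding to
    reference vertex i (assumed convention). ref_landmark_idx holds
    the vertex indices of the manually placed landmarks on the
    reference mesh; returns their transferred 3D positions on the
    target mesh (shape (num_landmarks, 3))."""
    return target_verts[p2p[ref_landmark_idx]]

def consensus_landmarks(maps_and_refs, target_verts):
    """Average the transfers from several reference specimens, as a
    simple way to use manual landmarks from the whole dataset."""
    transfers = [transfer_landmarks(p2p, idx, target_verts)
                 for p2p, idx in maps_and_refs]
    return np.mean(transfers, axis=0)
```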
We present an end-to-end method for capturing the dynamics of 3D human characters and translating them to synthesize new, visually realistic motion sequences. Conventional methods employ sophisticated but generic control approaches for driving the joints of articulated characters, paying little attention to the distinct dynamics of human joint movements. In contrast, our approach synthesizes human-like joint movements by exploiting a biologically plausible, compact network of spiking neurons of the kind that drives joint control in primates and rodents. We adapt the controller architecture by introducing learnable components and propose an evolutionary algorithm for training the spiking neural network architectures and capturing diverse joint dynamics. Our method requires only a few samples to capture the dynamic properties of a joint's motion and exploits the biologically inspired, trained controller for its reconstruction. More importantly, it can transfer the captured dynamics to new, visually plausible motion sequences. To enable user-dependent tailoring of the resulting motion sequences, we develop an interactive framework that allows editing and real-time visualization of the controlled 3D character. We also demonstrate the applicability of our method to real human motion-capture data by learning the hand-joint dynamics from a gesture dataset and using our framework to reconstruct the gestures with our 3D animated character. The compact architecture of our joint controller, which emerges from its biologically realistic design, and the inherent capacity of our evolutionary learning algorithm for parallelization suggest that our approach could provide an efficient and scalable alternative for synthesizing 3D character animations with diverse and visually realistic motion dynamics.
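As an illustration of the two ingredients named above, the sketch below pairs a leaky integrate-and-fire (LIF) update with a toy (1+1) evolutionary loop that mutates controller weights toward a target joint trajectory. All constants, the network size, the rate decoding, and the fitness function are hypothetical stand-ins for the paper's architecture.

```python
# Toy LIF controller evolved toward a target joint trajectory.
# Constants, network size, and fitness are illustrative assumptions.
import numpy as np

def run_controller(w, drive, tau=0.9, v_th=1.0):
    """Simulate a single-layer LIF population driven by a scalar
    input sequence; decode a crude joint signal from the spike rate."""
    v = np.zeros(w.shape[0])
    out = []
    for x in drive:
        v = tau * v + w @ np.atleast_1d(x)   # leak + integrate input
        spikes = (v >= v_th).astype(float)   # threshold crossing
        v = np.where(spikes > 0, 0.0, v)     # reset spiking neurons
        out.append(spikes.sum() / len(v))    # population-rate decoding
    return np.array(out)

def evolve(target, drive, n_neurons=16, gens=200, sigma=0.1, seed=0):
    """(1+1) evolution strategy: keep a weight mutation whenever it
    lowers the trajectory-matching error."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(n_neurons, 1))
    best = np.mean((run_controller(w, drive) - target) ** 2)
    for _ in range(gens):
        cand = w + sigma * rng.normal(size=w.shape)
        err = np.mean((run_controller(cand, drive) - target) ** 2)
        if err < best:
            w, best = cand, err
    return w, best
```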
This work is motivated by the need to automate the analysis of parent-infant interactions to better understand potential behavioral patterns useful for the early diagnosis of autism spectrum disorder (ASD). It presents a novel landmark-guided approach for synthesizing the facial expression exchanges that occur during parent-infant interactions. The proposed model consists of two components: (i) a landmark converter that receives a set of facial landmarks and a target emotion as input and outputs a new set of landmarks transformed to match the emotion; and (ii) an image converter that takes an input image, the target landmarks, and the target emotion and outputs a face transformed to match the input emotion. Including landmarks in the generation process proves useful for generating baby facial expressions, since babies have somewhat different facial musculature and facial dynamics than adults. This paper presents a realistic-looking matrix of changing facial expressions sampled from a 2D emotion continuum (valence and arousal) and displays facial expressions successfully transferred from real-life mother-infant dyads to novel ones.
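A minimal sketch of the two-component inference path is below; the converter callables, their signatures, and the (valence, arousal) emotion encoding are hypothetical stand-ins for the trained networks described in the abstract.

```python
def synthesize_expression(image, landmarks, target_emotion,
                          landmark_converter, image_converter):
    """Two-stage pipeline from the abstract: (i) transform the input
    landmarks to match the target emotion, then (ii) condition the
    image converter on the input face, the converted landmarks, and
    the emotion. target_emotion is assumed to be a (valence, arousal)
    pair sampled from the 2D emotion continuum."""
    new_landmarks = landmark_converter(landmarks, target_emotion)
    return image_converter(image, new_landmarks, target_emotion)
```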