
Title: BioFace-3D: Continuous 3D Facial Reconstruction Through Lightweight Single-Ear Biosensors
Over the last decade, facial landmark tracking and 3D reconstruction have gained considerable attention due to their numerous applications, such as human-computer interaction, facial expression analysis, and emotion recognition. Traditional approaches require users to be confined to a particular location and to face a camera under constrained recording conditions (e.g., without occlusions and under good lighting). This highly restricted setting prevents them from being deployed in many application scenarios involving human motion. In this paper, we propose the first single-earpiece lightweight biosensing system, BioFace-3D, that can unobtrusively, continuously, and reliably sense facial movements, track 2D facial landmarks, and further render 3D facial animations. Our single-earpiece biosensing system takes advantage of a cross-modal transfer learning model to transfer the knowledge embodied in a high-grade visual facial landmark detection model to the low-grade biosignal domain. After training, BioFace-3D can directly perform continuous 3D facial reconstruction from the biosignals, without any visual input. By removing the need for a camera positioned in front of the user, this paradigm shift from visual sensing to biosensing introduces new opportunities in many emerging mobile and IoT applications. Extensive experiments involving 16 participants under various settings demonstrate that BioFace-3D can accurately track 53 major facial landmarks with only 1.85 mm average error and 3.38% normalized mean error, which is comparable with most state-of-the-art camera-based solutions. The rendered 3D facial animations, which are consistent with the real human facial movements, also validate the system's capability for continuous 3D facial reconstruction.
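The cross-modal transfer learning idea described above can be sketched in a few lines: a visual "teacher" landmark detector labels synchronized biosignal windows, and a "student" regressor learns to predict the same 53 landmarks from biosignals alone. Everything below (shapes, the linear student, the feature dimension) is an illustrative assumption, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, L = 256, 32, 53 * 2        # samples, biosignal features, 53 (x, y) landmarks

X = rng.normal(size=(N, D))      # biosignal feature windows (hypothetical)
W_true = rng.normal(size=(D, L))
Y_teacher = X @ W_true           # landmarks "labelled" by the visual teacher

W = np.zeros((D, L))             # student: a toy linear landmark regressor
lr = 0.01
for _ in range(500):             # minimize MSE against the teacher's outputs
    grad = 2 * X.T @ (X @ W - Y_teacher) / N
    W -= lr * grad

pred = X @ W                     # at inference time, no camera is needed
mse = float(np.mean((pred - Y_teacher) ** 2))
print(f"student-teacher MSE: {mse:.4f}")
```

Once trained this way, the student consumes only the biosignal stream, which is what lets the earpiece operate without any visual input.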
Journal Name:
Proceedings of the 27th Annual International Conference on Mobile Computing and Networking (MobiCom '21)
Page Range or eLocation-ID:
350 to 363
Sponsoring Org:
National Science Foundation
More Like this
  1. Low-cost 3D scanners and automatic photogrammetry software have brought the digitization of objects into 3D models within reach of consumers. However, existing digitization techniques are either tedious, disruptive to the scanned object, or expensive. We create a novel 3D scanning system using consumer-grade hardware that revolves a camera around the object of interest. Our approach does not disturb the object during capture and allows us to scan delicate objects that can deform under motion, such as potted plants. Our system consists of a Raspberry Pi camera and computer, a stepper motor, a 3D printed camera track, and control software. Our 3D scanner allows the user to gather image sets for 3D model reconstruction using photogrammetry software with minimal effort. We scale 3D scanning to objects of varying sizes by designing our scanner using programmatic modeling, allowing the user to change the physical dimensions of the scanner without redrawing each part.
  2. We present a System for Processing In-situ Bio-signal Data for Emotion Recognition and Sensing (SPIDERS), a low-cost, wireless, glasses-based platform for continuous in-situ monitoring of a user's facial expressions (apparent emotions) and real emotions. We present algorithms to provide four core functions (eye shape and eyebrow movements, pupillometry, zygomaticus muscle movements, and head movements), using the bio-signals acquired from three non-contact sensors (IR camera, proximity sensor, IMU). SPIDERS distinguishes between different classes of apparent and real emotion states based on the aforementioned four bio-signals. We prototype advanced functionalities, including facial expression detection and real emotion classification: a landmark- and optical-flow-based facial expression detector that leverages changes in a user's eyebrows and eye shapes to achieve up to 83.87% accuracy, as well as a pupillometry-based real emotion classifier with higher accuracy than other low-cost wearable platforms that use sensors requiring skin contact. SPIDERS costs less than $20 to assemble and can continuously run for up to 9 hours before recharging. We demonstrate that SPIDERS is a truly wireless and portable platform that has the capability to impact a wide range of applications where knowledge of the user's emotional state is critical.
  3. Reconstructing 4D vehicular activity (3D space and time) from cameras is useful for autonomous vehicles, commuters, and local authorities to plan for smarter and safer cities. Traffic is inherently repetitious over long periods, yet current deep learning-based 3D reconstruction methods have not considered such repetitions and have difficulty generalizing to new intersection-installed cameras. We present a novel approach exploiting longitudinal (long-term) repetitious motion as self-supervision to reconstruct 3D vehicular activity from a video captured by a single fixed camera. Starting from off-the-shelf 2D keypoint detections, our algorithm optimizes 3D vehicle shapes and poses, and then clusters their trajectories in 3D space. The 2D keypoints and trajectory clusters accumulated over the long term are later used to improve the 2D and 3D keypoints via self-supervision without any human annotation. Our method improves reconstruction accuracy over the state of the art on scenes with a significant visual difference from the keypoint detector's training data, and has many applications including velocity estimation, anomaly detection, and vehicle counting. We demonstrate results on traffic videos captured at multiple city intersections, collected using our smartphones, YouTube, and other public datasets.
  4. Current collaborative augmented reality (AR) systems establish a common localization coordinate frame among users by exchanging and comparing maps comprised of feature points. However, relative positioning through map sharing struggles in dynamic or feature-sparse environments. It also requires that users exchange identical regions of the map, which may not be possible if they are separated by walls or facing different directions. In this paper, we present Cappella (like its musical inspiration, Cappella utilizes collaboration among agents to forgo the need for instrumentation), an infrastructure-free 6-degrees-of-freedom (6DOF) positioning system for multi-user AR applications that uses motion estimates and range measurements between users to establish an accurate relative coordinate system. Cappella uses visual-inertial odometry (VIO) in conjunction with ultra-wideband (UWB) ranging radios to estimate the relative position of each device in an ad hoc manner. The system leverages a collaborative particle filtering formulation that operates on sporadic messages exchanged between nearby users. Unlike visual landmark sharing approaches, this allows for collaborative AR sessions even if users do not share the same field of view, or if the environment is too dynamic for feature matching to be reliable. We show that not only is it possible to perform collaborative positioning without infrastructure or global coordinates, but that our approach provides nearly the same level of accuracy as fixed infrastructure approaches for AR teaming applications. Cappella consists of an open source UWB firmware and a reference mobile phone application that can display the location of team members in real time using mobile AR. We evaluate Cappella across multiple buildings under a wide variety of conditions, including a contiguous 30,000 square foot region spanning multiple floors, and find that it achieves median geometric error in 3D of less than 1 meter.
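The core fusion step in a particle filter over a peer's relative position can be illustrated with a single UWB range update. This is a toy sketch under assumed values (5,000 particles, 0.1 m ranging noise), not Cappella's actual implementation; VIO motion updates and subsequent ranges, which resolve the remaining angular ambiguity, are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

true_peer = np.array([3.0, 4.0])              # ground-truth relative position (m), assumed
particles = rng.uniform(-10, 10, size=(5000, 2))
weights = np.full(len(particles), 1 / len(particles))

uwb_sigma = 0.1                               # assumed UWB ranging noise (m)
measured_range = np.linalg.norm(true_peer) + rng.normal(0, uwb_sigma)

# Reweight each particle by the likelihood of the observed range.
dists = np.linalg.norm(particles, axis=1)
weights *= np.exp(-0.5 * ((dists - measured_range) / uwb_sigma) ** 2)
weights /= weights.sum()

# Resampling concentrates particles on the annulus at the measured range.
idx = rng.choice(len(particles), size=len(particles), p=weights)
particles = particles[idx]

est_range = float(np.linalg.norm(particles, axis=1).mean())
print(f"estimated range: {est_range:.2f} m (true {np.linalg.norm(true_peer):.2f} m)")
```

A single range collapses the particle cloud to a ring around the observer; combining several such updates with each user's VIO-estimated motion is what lets an approach like this recover a full relative coordinate frame.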
  5. As drones become widespread in numerous crucial applications with many powerful functionalities (e.g., reconnaissance and mechanical triggering), there are increasing cases of misused drones in unethical or even criminal activities. Therefore, it is of paramount importance to identify these malicious drones and track their origins using digital forensics. Traditional drone identification techniques for forensics (e.g., RF communication, ID landmarks using a camera, etc.) require high compliance of drones. However, malicious drones will not cooperate and may even spoof these identification techniques. Therefore, we present an exploration of a reliable and passive identification approach based on unique hardware traits of drones directly (analogous to the fingerprint and iris in humans) for forensics purposes. Specifically, we investigate and model the behavior of parasitic electronic elements under RF interrogation, a particular passive parasitic response modulated by an electronic system on drones, which is distinctive and unlikely to be counterfeited. Based on this theory, we design and implement DroneTrace, an end-to-end reliable and passive identification system toward digital drone forensics. DroneTrace comprises a cost-effective millimeter-wave (mmWave) probe, a software framework to extract and process parasitic responses, and a customized deep neural network (DNN)-based algorithm to analyze and identify drones. We evaluate the performance of DroneTrace with 36 commodity drones. Results show that DroneTrace can identify drones with an accuracy of over 99% and an equal error rate (EER) of 0.009, under a 0.1-second sensing time budget. Moreover, we test the reliability, robustness, and performance variation under a set of real-world circumstances, where DroneTrace maintains an accuracy of over 98%. DroneTrace is resilient to various attacks and maintains functionality. At its best, DroneTrace has the capacity to identify individual drones at the scale of 10⁴ with less than 5% error.
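The equal error rate (EER) reported above is the operating point where the false accept rate equals the false reject rate. The sketch below computes it for hypothetical, synthetically generated matching and non-matching similarity scores; the score distributions are illustrative assumptions, not DroneTrace's measured data.

```python
import numpy as np

rng = np.random.default_rng(2)
genuine = rng.normal(2.0, 1.0, 1000)      # scores for matching drones (assumed distribution)
impostor = rng.normal(0.0, 1.0, 1000)     # scores for non-matching drones (assumed)

thresholds = np.linspace(-4.0, 6.0, 2001)
far = np.array([(impostor >= t).mean() for t in thresholds])  # false accept rate
frr = np.array([(genuine < t).mean() for t in thresholds])    # false reject rate

# EER: threshold where the two error curves cross.
i = int(np.argmin(np.abs(far - frr)))
eer = float((far[i] + frr[i]) / 2)
print(f"EER ≈ {eer:.3f} at threshold {thresholds[i]:.2f}")
```

With well-separated hardware fingerprints, the genuine and impostor score distributions barely overlap, which is how a system can reach an EER as low as the 0.009 reported.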