skip to main content

Title: Real-time Semantic 3D Reconstruction for High-Touch Surface Recognition for Robotic Disinfection
Disinfection robots have applications in promoting public health and reducing hospital acquired infections and have drawn considerable interest due to the COVID-19 pandemic. To disinfect a room quickly, motion planning can be used to plan robot disinfection trajectories on a reconstructed 3D map of the room’s surfaces. However, existing approaches discard semantic information of the room and, thus, take a long time to perform thorough disinfection. Human cleaners, on the other hand, disinfect rooms more efficiently by prioritizing the cleaning of high-touch surfaces. To address this gap, we present a novel GPU-based volumetric semantic TSDF (Truncated Signed Distance Function) integration system for semantic 3D reconstruction. Our system produces 3D reconstructions that distinguish high-touch surfaces from non-high-touch surfaces at approximately 50 frames per second on a consumer-grade GPU, which is approximately 5 times faster than existing CPU-based TSDF semantic reconstruction methods. In addition, we extend a UV disinfection motion planning algorithm to incorporate semantic awareness for optimizing coverage of disinfection trajectories. Experiments show that our semantic-aware planning outperforms geometry-only planning by disinfecting up to 20% more high-touch surfaces under the same time budget. Further, the real-time nature of our semantic reconstruction pipeline enables future work on simultaneous disinfection and mapping. Code is available at: RA-SLAM  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Date Published:
Journal Name:
Proceedings of the IEEERSJ International Conference on Intelligent Robots and Systems
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Across a plethora of social situations, we touch others in natural and intuitive ways to share thoughts and emotions, such as tapping to get one’s attention or caressing to soothe one’s anxiety. A deeper understanding of these human-to-human interactions will require, in part, the precise measurement of skin-to-skin physical contact. Among prior efforts, each measurement approach exhibits certain constraints, e.g., motion trackers do not capture the precise shape of skin surfaces, while pressure sensors impede skin-to-skin contact. In contrast, this work develops an interference-free 3D visual tracking system using a depth camera to measure the contact attributes between the bare hand of a toucher and the forearm of a receiver. The toucher’s hand is tracked as a posed and positioned mesh by fitting a hand model to detected 3D hand joints, whereas a receiver’s forearm is extracted as a 3D surface updated upon repeated skin contact. Based on a contact model involving point clouds, the spatiotemporal changes of hand-to-forearm contact are decomposed as six, high-resolution, time-series contact attributes, i.e., contact area, indentation depth, absolute velocity, and three orthogonal velocity components, together with contact duration. To examine the system’s capabilities and limitations, two types of experiments were performed. First, to evaluate its ability to discern human touches, one person delivered cued social messages, e.g., happiness, anger, sympathy, to another person using their preferred gestures. The results indicated that messages and gestures, as well as the identities of the touchers, were readily discerned from their contact attributes. Second, the system’s spatiotemporal accuracy was validated against measurements from independent devices, including an electromagnetic motion tracker, sensorized pressure mat, and laser displacement sensor. While validated here in the context of social communication, this system is extendable to human touch interactions such as maternal care of infants and massage therapy. 
    more » « less
  2. Point cloud computation has become an increasingly more important workload thanks to its applications in autonomous driving. Unlike dense 2D computation, point cloud convolution has sparse and irregular computation patterns and thus requires dedicated inference system support with specialized high-performance kernels. While existing point cloud deep learning libraries have developed different dataflows for convolution on point clouds, they assume a single dataflow throughout the execution of the entire model. In this work, we systematically analyze and improve existing dataflows. Our resulting system, TorchSparse++, achieves 2.9x, 3.3x, 2.2x and 1.8x measured end-to-end speedup on an NVIDIA A100 GPU over the state-of-the-art MinkowskiEngine, SpConv 1.2, TorchSparse and SpConv v2 in inference respectively. Furthermore, TorchSparse++ is the only system to date that supports all necessary primitives for 3D segmentation, detection, and reconstruction workloads in autonomous driving. Code is publicly released at 
    more » « less
  3. Medical steerable needles can follow 3D curvilinear trajectories to avoid anatomical obstacles and reach clinically significant targets inside the human body. Automating steerable needle procedures can enable physicians and patients to harness the full potential of steerable needles by maximally leveraging their steerability to safely and accurately reach targets for medical procedures such as biopsies. For the automation of medical procedures to be clinically accepted, it is critical from a patient care, safety, and regulatory perspective to certify the correctness and effectiveness of the planning algorithms involved in procedure automation. In this paper, we take an important step toward creating a certifiable optimal planner for steerable needles. We present an efficient, resolution-complete motion planner for steerable needles based on a novel adaptation of multi-resolution planning. This is the first motion planner for steerable needles that guarantees to compute in finite time an obstacle-avoiding plan (or notify the user that no such plan exists), under clinically appropriate assumptions. Based on this planner, we then develop the first resolution-optimal motion planner for steerable needles that further provides theoretical guarantees on the quality of the computed motion plan, that is, global optimality, in finite time. Compared to state-of-the-art steerable needle motion planners, we demonstrate with clinically realistic simulations that our planners not only provide theoretical guarantees but also have higher success rates, have lower computation times, and result in higher quality plans. 
    more » « less
  4. Reconstructing 4D vehicular activity (3D space and time) from cameras is useful for autonomous vehicles, commuters and local authorities to plan for smarter and safer cities. Traffic is inherently repetitious over long periods, yet current deep learning-based 3D reconstruction methods have not considered such repetitions and have difficulty generalizing to new intersection-installed cameras. We present a novel approach exploiting longitudinal (long-term) repetitious motion as self-supervision to reconstruct 3D vehicular activity from a video captured by a single fixed camera. Starting from off-the-shelf 2D keypoint detections, our algorithm optimizes 3D vehicle shapes and poses, and then clusters their trajectories in 3D space. The 2D keypoints and trajectory clusters accumulated over long-term are later used to improve the 2D and 3D keypoints via self-supervision without any human annotation. Our method improves reconstruction accuracy over state of the art on scenes with a significant visual difference from the keypoint detector’s training data, and has many applications including velocity estimation, anomaly detection and vehicle counting. We demonstrate results on traffic videos captured at multiple city intersections, collected using our smartphones, YouTube, and other public datasets. 
    more » « less
  5. Steerable needles are capable of accurately targeting difficult-to-reach clinical sites in the body. By bending around sensitive anatomical structures, steerable needles have the potential to reduce the invasiveness of many medical procedures. However, inserting these needles with curved trajectories increases the risk of tissue damage due to perpendicular forces exerted on the surrounding tissue by the needle’s shaft, potentially resulting in lateral shearing through tissue. Such forces can cause significant tissue damage, negatively affecting patient outcomes. In this work, we derive a tissue and needle force model based on a Cosserat string formulation, which describes the normal forces and frictional forces along the shaft as a function of the planned needle path, friction model and parameters, and tip piercing force. We propose this new force model and associated cost function as a safer and more clinically relevant metric than those currently used in motion planning for steerable needles. We fit and validate our model through physical needle robot experiments in a gel phantom. We use this force model to define a bottleneck cost function for motion planning and evaluate it against the commonly used path-length cost function in hundreds of randomly generated three-dimensional (3D) environments. Plans generated with our force-based cost show a 62% reduction in the peak modeled tissue force with only a 0.07% increase in length on average compared to using the path-length cost in planning. Additionally, we demonstrate planning with our force-based cost function in a lung tumor biopsy scenario from a segmented computed tomography (CT) scan. By directly minimizing the modeled needle-to-tissue force, our method may reduce patient risk and improve medical outcomes from steerable needle interventions.

    more » « less