During in-hand manipulation, robots must be able to continuously estimate the pose of the object in order to generate appropriate control actions. The performance of algorithms for pose estimation hinges on the robot's sensors being able to detect discriminative geometric object features, but previous sensing modalities are unable to make such measurements robustly. The robot's fingers can occlude the view of environment- or robot-mounted image sensors, and tactile sensors can only measure at the local areas of contact. Motivated by fingertip-embedded proximity sensors' robustness to occlusion and ability to measure beyond the local areas of contact, we present the first evaluation of proximity sensor based pose estimation for in-hand manipulation. We develop a novel two-fingered hand with fingertip-embedded optical time-of-flight proximity sensors as a testbed for pose estimation during planar in-hand manipulation. Here, the in-hand manipulation task consists of the robot moving a cylindrical object from one end of its workspace to the other. We demonstrate, with statistical significance, that proximity-sensor based pose estimation via particle filtering during in-hand manipulation: a) exhibits 50% lower average pose error than a tactile-sensor based baseline; b) empowers a model predictive controller to achieve 30% lower final positioning error compared to when using tactile-sensor based pose estimates. 
                        more » 
                        « less   
                    
                            
                            ShapeMap 3-D: Efficient shape mapping through dense touch and vision
                        
                    
    
            Knowledge of 3-D object shape is of great importance to robot manipulation tasks, but may not be readily available in unstructured environments. While vision is often occluded during robot-object interaction, high-resolution tactile sensors can give a dense local perspective of the object. However, tactile sensors have limited sensing area and the shape representation must faithfully approximate non-contact areas. In addition, a key challenge is efficiently incorporating these dense tactile measurements into a 3-D mapping framework. In this work, we propose an incremental shape mapping method using a GelSight tactile sensor and a depth camera. Local shape is recovered from tactile images via a learned model trained in simulation. Through efficient inference on a spatial factor graph informed by a Gaussian process, we build an implicit surface representation of the object. We demonstrate visuo-tactile mapping in both simulated and real-world experiments, to incrementally build 3-D reconstructions of household objects. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 2008279
- PAR ID:
- 10388410
- Date Published:
- Journal Name:
- IEEE International Conference on Robotics and Automation
- Page Range / eLocation ID:
- 7073 to 7080
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            In this work, we propose a novel method to supervise 3D Gaussian Splatting (3DGS) scenes using optical tactile sensors. Optical tactile sensors have become widespread in their use in robotics for manipulation and object representation; however, raw optical tactile sensor data is unsuitable to directly supervise a 3DGS scene. Our representation leverages a Gaussian Process Implicit Surface to implicitly represent the object, combining many touches into a unified representation with uncertainty. We merge this model with a monocular depth estimation network, which is aligned in a two stage process, coarsely aligning with a depth camera and then finely adjusting to match our touch data. For every training image, our method produces a corresponding fused depth and uncertainty map. Utilizing this additional information, we propose a new loss function, variance weighted depth supervised loss, for training the 3DGS scene model. We leverage the DenseTact optical tactile sensor and RealSense RGB-D camera to show that combining touch and vision in this manner leads to quantitatively and qualitatively better results than vision or touch alone in a few-view scene syntheses on opaque as well as on reflective and transparent objects. Please see our project page at armlabstanford.github.io/touch-gsmore » « less
- 
            We introduce Vysics, a vision-and-physics framework for a robot to build an expressive geometry and dynamics model of a single rigid body, using a seconds-long RGBD video and the robot’s proprioception. While the computer vision community has built powerful visual 3D perception algorithms, cluttered environments with heavy occlusions can limit the visibility of objects of interest. However, observed motion of partially occluded objects can imply physical interactions took place, such as contact with a robot or the environment. These inferred contacts can supplement the visible geometry with "physible geometry," which best explains the observed object motion through physics. Vysics uses a vision-based tracking and reconstruction method, BundleSDF, to estimate the trajectory and the visible geometry from an RGBD video, and an odometry-based model learning method, Physics Learning Library (PLL), to infer the "physible" geometry from the trajectory through implicit contact dynamics optimization. The visible and "physible" geometries jointly factor into optimizing a signed distance function (SDF) to represent the object shape. Vysics does not require pretraining, nor tactile or force sensors. Compared with vision-only methods, Vysics yields object models with higher geometric accuracy and better dynamics prediction in experiments where the object interacts with the robot and the environment under heavy occlusion.more » « less
- 
            UniT is an approach to tactile representation learn¬ing, using VQGAN to learn a compact latent space and serve as the tactile representation. It uses tactile images obtained from a single simple object to train the representation with generalizability. This tactile representation can be zero-shot transferred to various downstream tasks, including perception tasks and manipulation policy learning. Our benchmarkings on in-hand 3D pose and 6D pose estimation tasks and a tactile classifcation task show that UniT outperforms existing visual and tactile representation learning methods. Additionally, UniT’s effectiveness in policy learning is demonstrated across three real-world tasks involving diverse manipulated objects and complex robot-object-environment interactions. Through extensive experi¬mentation, UniT is shown to be a simple-to-train, plug-and-play, yet widely effective method for tactile representation learning. For more details, please refer to our open-source repository https://github.com/ZhengtongXu/UniT and the project website https://zhengtongxu.github.io/unit-website/.more » « less
- 
            Developing whole-body tactile skins for robots remains a challenging task, as existing solutions often prioritize modular, one-size-fits-all designs, which, while versatile, fail to account for the robot’s specific shape and the unique demands of its operational context. In this work, we introduce GenTact Toolbox, a computational pipeline for creating versatile wholebody tactile skins tailored to both robot shape and application domain. Our method includes procedural mesh generation for conforming to a robot’s topology, task-driven simulation to refine sensor distribution, and multi-material 3D printing for shape-agnostic fabrication. We validate our approach by creating and deploying six capacitive sensing skins on a Franka Research 3 robot arm in a human-robot interaction scenario. This work represents a shift from “one-size-fits-all” tactile sensors toward context-driven, highly adaptable designs that can be customized for a wide range of robotic systems and applications. The project website is available at https://hiro-group.ronc.one/gentacttoolboxmore » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    