During in-hand manipulation, robots must continuously estimate the pose of the object in order to generate appropriate control actions. The performance of pose-estimation algorithms hinges on the robot's sensors being able to detect discriminative geometric object features, but previous sensing modalities are unable to make such measurements robustly. The robot's fingers can occlude the view of environment- or robot-mounted image sensors, and tactile sensors can only measure at the local areas of contact. Motivated by fingertip-embedded proximity sensors' robustness to occlusion and ability to measure beyond the local areas of contact, we present the first evaluation of proximity-sensor-based pose estimation for in-hand manipulation. We develop a novel two-fingered hand with fingertip-embedded optical time-of-flight proximity sensors as a testbed for pose estimation during planar in-hand manipulation. Here, the in-hand manipulation task consists of the robot moving a cylindrical object from one end of its workspace to the other. We demonstrate, with statistical significance, that proximity-sensor-based pose estimation via particle filtering during in-hand manipulation: a) exhibits 50% lower average pose error than a tactile-sensor-based baseline; and b) empowers a model predictive controller to achieve 30% lower final positioning error than when using tactile-sensor-based pose estimates.
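The abstract above does not give implementation details; as a rough illustration of what a proximity-based particle filter for this planar task might look like, consider the sketch below. The sensor positions, noise levels, and simplified measurement model are assumptions for illustration, not the authors' values.

```python
import numpy as np

# All parameters below are illustrative assumptions, not values from the paper.
CYL_RADIUS = 0.02   # m, radius of the manipulated cylinder
RANGE_STD  = 0.003  # m, assumed ToF range noise
N_PART     = 500    # number of particles

def predict(particles, motion, motion_std=0.002):
    """Propagate particles by the commanded in-plane object motion plus noise."""
    return particles + motion + np.random.normal(0.0, motion_std, particles.shape)

def expected_range(particles, sensor_xy):
    """Approximate ToF reading: distance from a fingertip sensor to the cylinder surface."""
    return np.clip(np.linalg.norm(particles - sensor_xy, axis=1) - CYL_RADIUS, 0.0, None)

def update(weights, particles, sensor_xy, measured_range):
    """Re-weight particles by the Gaussian likelihood of one proximity reading."""
    err = expected_range(particles, sensor_xy) - measured_range
    weights = weights * np.exp(-0.5 * (err / RANGE_STD) ** 2) + 1e-300
    return weights / weights.sum()

def resample(particles, weights):
    """Systematic resampling back to uniform weights."""
    positions = (np.arange(N_PART) + np.random.rand()) / N_PART
    idx = np.minimum(np.searchsorted(np.cumsum(weights), positions), N_PART - 1)
    return particles[idx], np.full(N_PART, 1.0 / N_PART)

# One filter step: propagate, fuse two fingertip readings, resample, estimate.
particles = np.random.normal([0.10, 0.00], 0.01, (N_PART, 2))   # (x, y) hypotheses
weights = np.full(N_PART, 1.0 / N_PART)
particles = predict(particles, motion=np.array([0.001, 0.0]))
for sensor_xy, z in [(np.array([0.06,  0.03]), 0.021),           # made-up readings
                     (np.array([0.06, -0.03]), 0.019)]:
    weights = update(weights, particles, sensor_xy, z)
particles, weights = resample(particles, weights)
pose_estimate = particles.mean(axis=0)   # estimated cylinder position in the plane
```

The resulting estimate would then feed the model predictive controller described in the abstract; that controller is not sketched here.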
Multimodal Proximity and Visuotactile Sensing With a Selectively Transmissive Soft Membrane
The most common sensing modalities found in a robot perception system are vision and touch, which together can provide global and highly localized data for manipulation. However, these sensing modalities often fail to adequately capture the behavior of target objects during the critical moments when they transition from static, controlled contact with an end-effector to dynamic, uncontrolled motion. In this work, we present a novel multimodal visuotactile sensor that provides simultaneous visuotactile and proximity depth data. The sensor integrates an RGB camera and an air pressure sensor to sense touch with an infrared time-of-flight (ToF) camera to sense proximity, leveraging a selectively transmissive soft membrane to enable the dual sensing modalities. We present the mechanical design, fabrication techniques, algorithm implementations, and evaluation of the sensor's tactile and proximity modalities. The sensor is demonstrated in three open-loop robotic tasks: approaching and contacting an object, catching, and throwing. The fusion of tactile and proximity data could be used to capture key information about a target object's transition behavior for sensor-based control in dynamic manipulation.
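The abstract describes the two modalities but not a software interface. One way to picture the fusion is a simple contact-gated reader, sketched below; the thresholds, the pressure-to-force proxy, and the data layout are placeholders and are not the sensor's actual API.

```python
import numpy as np
from dataclasses import dataclass

# Placeholder thresholds and calibration -- not values or APIs from the paper.
AMBIENT_PRESSURE_PA = 101_325.0
CONTACT_RISE_PA     = 150.0      # pressure rise we treat as membrane contact

@dataclass
class Reading:
    in_contact: bool
    proximity_mm: float | None       # nearest surface distance when not in contact
    contact_force_est: float | None  # crude force proxy when in contact

def fuse(pressure_pa: float, tof_depth_mm: np.ndarray) -> Reading:
    """Gate between the tactile (pressure) and proximity (ToF depth) channels."""
    overpressure = pressure_pa - AMBIENT_PRESSURE_PA
    if overpressure > CONTACT_RISE_PA:
        # Membrane deformed by contact: use the pressure rise as a rough force proxy.
        return Reading(True, None, 0.01 * overpressure)
    # No contact: report the closest valid point seen through the membrane.
    valid = tof_depth_mm[tof_depth_mm > 0]
    return Reading(False, float(valid.min()) if valid.size else None, None)

# Synthetic example in place of real sensor streams.
depth = np.full((8, 8), 120.0)
depth[3, 4] = 35.0
print(fuse(101_380.0, depth))   # -> not in contact, ~35 mm proximity
```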
- Award ID(s): 1935294
- PAR ID: 10379073
- Date Published:
- Journal Name: 2022 IEEE 5th International Conference on Soft Robotics (RoboSoft)
- Page Range / eLocation ID: 802 to 808
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In this work, we propose a novel method to supervise 3D Gaussian Splatting (3DGS) scenes using optical tactile sensors. Optical tactile sensors have become widespread in robotics for manipulation and object representation; however, raw optical tactile sensor data is unsuitable for directly supervising a 3DGS scene. Our representation leverages a Gaussian Process Implicit Surface to implicitly represent the object, combining many touches into a unified representation with uncertainty. We merge this model with a monocular depth estimation network, which is aligned in a two-stage process: coarsely aligning with a depth camera and then finely adjusting to match our touch data. For every training image, our method produces a corresponding fused depth and uncertainty map. Utilizing this additional information, we propose a new loss function, the variance-weighted depth supervised loss, for training the 3DGS scene model (a minimal sketch of such a loss appears after this list). We leverage the DenseTact optical tactile sensor and a RealSense RGB-D camera to show that combining touch and vision in this manner leads to quantitatively and qualitatively better results than vision or touch alone in few-view scene synthesis, on opaque as well as on reflective and transparent objects. Please see our project page at armlabstanford.github.io/touch-gs
- We describe a single fingertip-mounted sensing system for robot manipulation that provides proximity (pre-touch), contact detection (touch), and force sensing (post-touch). The sensor system consists of optical time-of-flight range measurement modules covered in a clear elastomer. Because the elastomer is clear, the sensor can detect and range nearby objects, as well as measure deformations caused by objects that are in contact with the sensor and thereby estimate the applied force. We examine how this sensor design can be improved with respect to invariance to object reflectivity, signal-to-noise ratio, and continuous operation when switching between the distance and force measurement regimes. By harnessing time-of-flight technology and optimizing the elastomer-air boundary to control the emitted light's path, we develop a sensor that is able to seamlessly transition between measuring distances of up to 50 mm and contact forces of up to 10 newtons (a rough sketch of this regime switch appears after this list). We demonstrate that our sensor improves manipulation accuracy in a block unstacking task. Thorough instructions for manufacturing the sensor from inexpensive, commercially available components are provided, as well as all relevant hardware design files and software sources.
- This paper proposes and evaluates the use of image classification for detailed, full-body human-robot tactile interaction. A camera positioned below a translucent robot skin captures shadows generated by human touch and infers social gestures from the captured images. This approach enables rich tactile interaction with robots without the need for the sensor arrays used in traditional social robot tactile skins. It also supports touch interaction with non-rigid robots, achieves high-resolution sensing for robots with different sizes and surface shapes, and removes the requirement of direct contact with the robot. We demonstrate the idea with an inflatable robot and a stand-alone testing device, an algorithm for recognizing touch gestures from shadows that uses Densely Connected Convolutional Networks (a minimal classifier stand-in appears after this list), and an algorithm for tracking positions of touch and hovering shadows. Our experiments show that the system can distinguish between six touch gestures under three lighting conditions with 87.5-96.0% accuracy, depending on the lighting, and can accurately track touch positions as well as infer motion activities in realistic interaction conditions. Additional applications for this method include interactive screens on inflatable robots and privacy-maintaining robots for the home.
- Multi-sensor fusion has been widely used by autonomous vehicles (AVs) to integrate perception results from different sensing modalities, including LiDAR, camera, and radar. Despite the rapid development of multi-sensor fusion systems in autonomous driving, their vulnerability to malicious attacks has not been well studied. Although some prior works have studied attacks against the perception systems of AVs, they only consider a single sensing modality or a camera-LiDAR fusion system, and thus cannot attack a sensor fusion system based on LiDAR, camera, and radar. To fill this research gap, in this paper we present the first study on the vulnerability of multi-sensor fusion systems that employ LiDAR, camera, and radar. Specifically, we propose a novel attack method that can simultaneously attack all three types of sensing modalities using a single type of adversarial object. The adversarial object can be easily fabricated at low cost, and the proposed attack can be easily performed with high stealthiness and flexibility in practice. Extensive experiments based on a real-world AV testbed show that the proposed attack can continuously hide a target vehicle from the perception system of a victim AV using only two small adversarial objects.
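For the Gaussian Splatting entry above, the variance-weighted depth supervised loss is only named, not specified. One plausible reading, assuming per-pixel fused depth and variance maps, is to down-weight the depth error where the fused uncertainty is high; the weighting below is an assumption, not the paper's exact formulation.

```python
import torch

def variance_weighted_depth_loss(pred_depth, fused_depth, fused_var, eps=1e-6):
    """Depth supervision that down-weights pixels with high fused uncertainty.

    pred_depth, fused_depth, fused_var: (H, W) tensors. fused_var is assumed to
    be the per-pixel variance from the touch/vision fusion; the paper's exact
    weighting scheme may differ.
    """
    weight = 1.0 / (fused_var + eps)
    weight = weight / weight.mean()                     # keep the loss scale stable
    return (weight * (pred_depth - fused_depth).abs()).mean()

# Random tensors standing in for a rendered depth map and a fused depth/variance pair.
H, W = 64, 64
loss = variance_weighted_depth_loss(torch.rand(H, W),
                                    torch.rand(H, W),
                                    0.1 * torch.rand(H, W) + 1e-3)
```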
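For the fingertip time-of-flight entry, the described switch between ranging (up to 50 mm) and force sensing (up to 10 N) from elastomer compression could be approximated as below; the elastomer rest thickness and the linear stiffness are invented for illustration and are not the paper's calibration.

```python
# Hypothetical constants -- not the paper's calibration.
ELASTOMER_THICKNESS_MM = 4.0   # optical path through the clear elastomer at rest
STIFFNESS_N_PER_MM     = 2.5   # assumed linear force/compression relation
MAX_RANGE_MM           = 50.0
MAX_FORCE_N            = 10.0

def interpret_tof(raw_range_mm: float) -> dict:
    """Map one raw ToF range reading to either a distance or a force estimate."""
    if raw_range_mm >= ELASTOMER_THICKNESS_MM:
        # Beam exits the elastomer and returns from an external surface: proximity regime.
        return {"regime": "proximity",
                "distance_mm": min(raw_range_mm - ELASTOMER_THICKNESS_MM, MAX_RANGE_MM)}
    # Reading shorter than the rest thickness: the elastomer is being compressed.
    compression = ELASTOMER_THICKNESS_MM - raw_range_mm
    return {"regime": "contact",
            "force_N": min(STIFFNESS_N_PER_MM * compression, MAX_FORCE_N)}

print(interpret_tof(27.0))   # proximity regime, ~23 mm to the object
print(interpret_tof(2.4))    # contact regime, ~4 N estimated force
```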
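For the shadow-based tactile skin entry, the reported Densely Connected Convolutional Network classifier can be stood in for with an off-the-shelf DenseNet; the input size, the six-class head, and the use of torchvision are assumptions about a comparable setup, not the authors' code.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_GESTURES = 6   # the paper distinguishes six touch gestures

# Off-the-shelf DenseNet backbone with its head swapped for the gesture classes.
model = models.densenet121(weights=None)
model.classifier = nn.Linear(model.classifier.in_features, NUM_GESTURES)

# A captured shadow image (grayscale in practice), resized and repeated to 3 channels.
shadow = torch.rand(1, 3, 224, 224)   # stand-in for a real frame from under the skin
logits = model(shadow)
gesture = int(logits.argmax(dim=1))   # predicted gesture index
```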