We consider the problem of in-hand dexterous manipulation with a focus on unknown or uncertain hand–object parameters, such as hand configuration, object pose within the hand, and contact positions. In particular, in this work we formulate a generic framework for hand–object configuration estimation, using underactuated hands as an example. Owing to the passive reconfigurability and the lack of encoders in the hand’s joints, it is challenging to estimate, plan, and actively control underactuated manipulation. By modeling the grasp constraints, we present a particle filter-based framework to estimate the hand configuration. Specifically, given an arbitrary grasp, we start by sampling a set of hand configuration hypotheses and then randomly manipulate the object within the hand. While observing the object’s movements as evidence using an external camera, which is not necessarily calibrated with the hand frame, our estimator calculates the likelihood of each hypothesis to iteratively estimate the hand configuration. Once converged, the estimator is used to track the hand configuration in real time for future manipulations. Thereafter, we develop an algorithm to precisely plan and control the underactuated manipulation to move the grasped object to desired poses. In contrast to most other dexterous manipulation approaches, our framework does not require any tactile…
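The particle-filter loop described in the abstract (sample hypotheses, observe object motion, reweight, resample) can be sketched as follows. The Gaussian observation model, the noise scale, and the hypothetical `predict_motion` function that maps a configuration hypothesis to an expected object motion are assumptions of this sketch, not the authors' actual formulation.

```python
import numpy as np

def particle_filter_step(particles, weights, observed_motion, predict_motion, noise_std=0.05):
    """One update of a particle filter over hand-configuration hypotheses.

    particles:       (N, D) array, each row a hypothesised hand configuration
    weights:         (N,) array of current particle weights
    observed_motion: object motion measured by the external camera
    predict_motion:  hypothetical function mapping a configuration hypothesis
                     to the object motion it would imply under the grasp model
    """
    # Likelihood of each hypothesis given the observed object motion
    # (a Gaussian observation model is an assumption of this sketch).
    errors = np.array([np.linalg.norm(predict_motion(p) - observed_motion) for p in particles])
    likelihood = np.exp(-0.5 * (errors / noise_std) ** 2)
    weights = weights * likelihood
    weights /= weights.sum() + 1e-12

    # Resample when the effective sample size collapses, then jitter the
    # surviving hypotheses to keep the particle set diverse.
    n_eff = 1.0 / np.sum(weights ** 2)
    if n_eff < 0.5 * len(particles):
        idx = np.random.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx] + np.random.normal(0.0, noise_std, particles.shape)
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```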
Tactile Behaviors with the Vision-Based Tactile Sensor FingerVision
This paper introduces FingerVision, a vision-based tactile sensor, and explores its usefulness in tactile behaviors. FingerVision consists of a transparent elastic skin marked with dots and a camera; it is easy to fabricate, low cost, and physically robust. Unlike other vision-based tactile sensors, the complete transparency of the FingerVision skin provides multimodal sensation. The modalities sensed by FingerVision include distributions of force and slip, and object information such as distance, location, pose, size, shape, and texture. The slip detection is very sensitive since it is obtained by computer vision applied directly to the output of the FingerVision camera. It provides high-resolution slip detection that does not depend on the contact force, i.e., it can sense slip of a lightweight object that generates negligible contact force. The tactile behaviors explored in this paper include manipulations that exploit this feature. For example, we demonstrate that grasp adaptation with FingerVision enables grasping origami and other deformable and fragile objects such as vegetables, fruits, and raw eggs.
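A minimal way to obtain the kind of vision-based slip signal described above is to track the skin's marker dots between consecutive frames and threshold their mean displacement. The OpenCV feature-tracking pipeline and the threshold value below are illustrative assumptions, not the FingerVision implementation.

```python
import cv2
import numpy as np

def detect_slip(prev_gray, curr_gray, slip_threshold=1.5):
    """Estimate slip from two consecutive grayscale skin images by tracking
    the apparent motion of the marker dots (a simplified stand-in for the
    paper's vision pipeline; the threshold is an assumed value in pixels)."""
    # Detect dot-like features in the previous frame.
    prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                       qualityLevel=0.01, minDistance=5)
    if prev_pts is None:
        return False, 0.0
    # Track them into the current frame with pyramidal Lucas-Kanade flow.
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None)
    good = status.ravel() == 1
    if not good.any():
        return False, 0.0
    # Mean displacement magnitude of the successfully tracked dots.
    displacement = np.linalg.norm((curr_pts - prev_pts)[good], axis=2).mean()
    return displacement > slip_threshold, float(displacement)
```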
- Journal Name: International Journal of Humanoid Robotics
- Sponsoring Org: National Science Foundation
More Like this
The most common sensing modalities found in a robot perception system are vision and touch, which together can provide global and highly localized data for manipulation. However, these sensing modalities often fail to adequately capture the behavior of target objects during the critical moments when they transition from static, controlled contact with an end-effector to dynamic, uncontrolled motion. In this work, we present a novel multimodal visuotactile sensor that provides simultaneous visuotactile and proximity depth data. The sensor integrates an RGB camera and an air pressure sensor to sense touch with an infrared time-of-flight (ToF) camera to sense proximity, leveraging a selectively transmissive soft membrane to enable the dual sensing modalities. We present the mechanical design, fabrication techniques, algorithm implementations, and evaluation of the sensor's tactile and proximity modalities. The sensor is demonstrated in three open-loop robotic tasks: approaching and contacting an object, catching, and throwing. The fusion of tactile and proximity data could be used to capture key information about a target object's transition behavior for sensor-based control in dynamic manipulation.
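One simple way to fuse the proximity and tactile channels into a coarse contact-transition signal is a threshold-based classifier like the sketch below; the sensor readings it consumes and the threshold values are hypothetical, not part of the paper's algorithm.

```python
from enum import Enum

class ContactState(Enum):
    APPROACHING = 1   # object visible in the proximity depth channel, no touch yet
    IN_CONTACT = 2    # tactile pressure above threshold
    FREE = 3          # neither proximity nor touch detected

def classify_contact(min_depth_mm, pressure_pa, near_mm=30.0, touch_pa=150.0):
    """Fuse a ToF proximity reading and an air-pressure tactile reading into a
    coarse contact state (both thresholds are illustrative assumptions)."""
    if pressure_pa > touch_pa:
        return ContactState.IN_CONTACT
    if min_depth_mm < near_mm:
        return ContactState.APPROACHING
    return ContactState.FREE
```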
Although general-purpose robotic manipulators are becoming more capable at manipulating various objects, their ability to manipulate millimeter-scale objects is usually limited. On the other hand, ultrasonic levitation devices have been shown to levitate a large range of small objects, from polystyrene balls to living organisms. By controlling the acoustic force fields, ultrasonic levitation devices can compensate for robot manipulator positioning uncertainty and control the grasping force exerted on the target object. The material-agnostic nature of acoustic levitation devices and their ability to dexterously manipulate millimeter-scale objects make them appealing as a grasping mode for general-purpose robots. In this work, we present an ultrasonic, contact-less manipulation device that can be attached to or picked up by any general-purpose robotic arm, enabling millimeter-scale manipulation with little to no modification to the robot itself. This device is capable of performing the first phase-controlled picking action on acoustically reflective surfaces. With the manipulator placed around the target object, it can grasp objects smaller in size than the robot's positioning uncertainty, trap the object to resist air currents during robot movement, and dexterously hold a small and fragile object, like a flower bud. Due to the contact-less nature of…
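Phase-controlled focusing of an ultrasonic array amounts to choosing each transducer's drive phase so that all emitted waves arrive in phase at a target point. The sketch below computes those phases from the array geometry; the 40 kHz operating frequency and the element positions are assumed inputs for illustration, not the device's actual firmware.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s in air
FREQUENCY = 40_000.0     # Hz, a typical ultrasonic transducer frequency (assumption)

def focusing_phases(transducer_positions, focal_point):
    """Per-transducer phase delays that make the emitted waves arrive in phase
    at focal_point, creating an acoustic pressure focus there.

    transducer_positions: (N, 3) array of element positions in metres
    focal_point:          (3,) target point in the same frame
    """
    wavelength = SPEED_OF_SOUND / FREQUENCY
    k = 2.0 * np.pi / wavelength                           # wavenumber
    distances = np.linalg.norm(transducer_positions - focal_point, axis=1)
    # Each element is advanced by its path-length phase, wrapped to [0, 2*pi),
    # so that all contributions add constructively at the focus.
    return (-k * distances) % (2.0 * np.pi)
```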
Robotic grasping typically follows five stages: object detection, object localisation, object pose estimation, grasp pose estimation, and grasp planning. We focus on object pose estimation. Our approach relies on three pieces of information: multiple views of the object, the camera’s extrinsic parameters at those viewpoints, and 3D CAD models of objects. The first step involves a standard deep learning backbone (FCN ResNet) to estimate the object label, semantic segmentation, and a coarse estimate of the object pose with respect to the camera. Our novelty is a refinement module that starts from the coarse pose estimate and refines it by optimisation through differentiable rendering. This is a purely vision-based approach that avoids the need for other information such as point clouds or depth images. We evaluate our object pose estimation approach on the ShapeNet dataset and show improvements over the state of the art. We also show that the estimated object pose results in 99.65% grasp accuracy with the ground-truth grasp candidates on the Object Clutter Indoor Dataset (OCID) Grasp dataset, as computed using standard practice.
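The refinement-by-differentiable-rendering step can be sketched as plain gradient descent over the pose parameters. In the sketch below, `render_fn` stands in for a differentiable silhouette renderer, and the axis-angle parameterisation, loss, and optimiser settings are assumptions rather than the paper's exact setup.

```python
import torch

def refine_pose(render_fn, observed_mask, init_rotvec, init_trans, iters=100, lr=1e-2):
    """Refine a coarse object pose by gradient descent through a differentiable
    renderer. render_fn is assumed to be a differentiable function mapping
    (rotation_vector, translation) -> silhouette image; the MSE loss and Adam
    settings are illustrative choices."""
    rotvec = init_rotvec.clone().detach().requires_grad_(True)
    trans = init_trans.clone().detach().requires_grad_(True)
    optimiser = torch.optim.Adam([rotvec, trans], lr=lr)
    for _ in range(iters):
        optimiser.zero_grad()
        rendered = render_fn(rotvec, trans)                     # differentiable render
        loss = torch.nn.functional.mse_loss(rendered, observed_mask)
        loss.backward()                                         # gradients flow through the renderer
        optimiser.step()
    return rotvec.detach(), trans.detach()
```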
This paper proposes and evaluates the use of image classification for detailed, full-body human-robot tactile interaction. A camera positioned below a translucent robot skin captures shadows generated from human touch and infers social gestures from the captured images. This approach enables rich tactile interaction with robots without the need for the sensor arrays used in traditional social robot tactile skins. It also supports touch interaction with non-rigid robots, achieves high-resolution sensing for robots with different sizes and shapes of surfaces, and removes the requirement of direct contact with the robot. We demonstrate the idea with an inflatable robot and a stand-alone testing device, an algorithm for recognizing touch gestures from shadows that uses Densely Connected Convolutional Networks, and an algorithm for tracking positions of touch and hovering shadows. Our experiments show that the system can distinguish between six touch gestures under three lighting conditions with 87.5-96.0% accuracy, depending on the lighting, and can accurately track touch positions as well as infer motion activities in realistic interaction conditions. Additional applications for this method include interactive screens on inflatable robots and privacy-maintaining robots for the home.
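A shadow-image gesture classifier along these lines can be assembled from an off-the-shelf DenseNet; the input size, preprocessing, and use of torchvision's `densenet121` below are assumptions of this sketch rather than the authors' trained model or data pipeline.

```python
import torch
from torchvision import models, transforms

# DenseNet classifier over shadow images, with the output layer sized for the
# six touch gestures reported in the paper (architecture variant, input size,
# and preprocessing are assumptions of this sketch).
model = models.densenet121(num_classes=6)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def classify_gesture(pil_image):
    """Return the predicted gesture index for one shadow image (PIL image)."""
    x = preprocess(pil_image.convert("RGB")).unsqueeze(0)   # (1, 3, 224, 224)
    with torch.no_grad():
        logits = model(x)
    return int(logits.argmax(dim=1).item())
```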