Recent reinforcement learning (RL) approaches have shown strong performance in complex domains such as Atari games, but are often highly sample-inefficient. A common approach to reduce interaction time with the environment is to use reward shaping, which involves carefully designing reward functions that provide the agent intermediate rewards for progress towards the goal. However, designing appropriate shaping rewards is known to be difficult and time-consuming. In this work, we address this problem by using natural language instructions to perform reward shaping. We propose the LanguagE-Action Reward Network (LEARN), a framework that maps free-form natural language instructions to intermediate rewards based on actions taken by the agent. These intermediate language-based rewards can seamlessly be integrated into any standard reinforcement learning algorithm. We experiment with Montezuma’s Revenge from the Arcade Learning Environment, a popular benchmark in RL. Our experiments on a diverse set of 15 tasks demonstrate that, for the same number of interactions with the environment, language-based rewards lead to successful completion of the task 60% more often on average, compared to learning without language.
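As a rough illustration of how an intermediate language-based reward could be combined with the environment reward, the sketch below adds a weighted language score to the reward the agent would otherwise receive. It is not the LEARN model: `toy_language_reward`, the `ACTION_IDS` mapping, and the `weight` parameter are hypothetical stand-ins chosen for this example, and a real system would replace the keyword matcher with a learned network.

```python
from typing import Callable, Sequence

# Hypothetical mapping from instruction words to discrete action ids.
ACTION_IDS = {"left": 0, "right": 1, "jump": 2}


def toy_language_reward(instruction: str, recent_actions: Sequence[int]) -> float:
    """Stand-in scorer (an assumption, not LEARN): fraction of recent
    actions that are named in the instruction."""
    wanted = {ACTION_IDS[w] for w in instruction.lower().split() if w in ACTION_IDS}
    if not wanted or not recent_actions:
        return 0.0
    hits = sum(1 for a in recent_actions if a in wanted)
    return hits / len(recent_actions)


def shaped_reward(
    env_reward: float,
    instruction: str,
    recent_actions: Sequence[int],
    language_reward_fn: Callable[[str, Sequence[int]], float],
    weight: float = 1.0,
) -> float:
    """Environment reward plus a weighted intermediate language-based reward."""
    return env_reward + weight * language_reward_fn(instruction, recent_actions)
```

Because the shaped value is a plain scalar, it can be passed to a standard RL algorithm wherever the environment reward would normally be used.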
The Role of Tactile Sensing in Learning and Deploying Grasp Refinement Algorithms
A long-standing question in robot hand design is how accurate tactile sensing must be. This paper uses simulated tactile signals and the reinforcement learning (RL) framework to study the sensing needs in grasping systems. Our first experiment investigates the need for rich tactile sensing in the rewards of RL-based grasp refinement algorithms for multi-fingered robotic hands. We systematically integrate different levels of tactile data into the rewards using analytic grasp stability metrics. We find that combining information on contact positions, normals, and forces in the reward yields the highest average success rates of 95.4% for cuboids, 93.1% for cylinders, and 62.3% for spheres across wrist position errors between 0 and 7 centimeters and rotational errors between 0 and 14 degrees. This contact-based reward outperforms a non-tactile binary-reward baseline by 42.9%. Our follow-up experiment shows that, when training with tactile-enabled rewards, the tactile information in the control policy’s state vector can be drastically reduced, with a performance decrease of at most 6.6% when the state contains no tactile sensing at all. Since policies do not require access to the reward signal at test time, our work implies that models trained on tactile-enabled hands are deployable to robotic hands with a smaller sensor suite, potentially reducing cost dramatically.
- Award ID(s): 1924984
- NSF-PAR ID: 10485526
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- ISSN: 2153-0866
- ISBN: 978-1-6654-7927-1
- Page Range / eLocation ID: 7766 to 7772
- Format(s): Medium: X
- Location: Kyoto, Japan
- Sponsoring Org: National Science Foundation
More Like this
Madden, John D. ; Anderson, Iain A. ; Shea, Herbert R. (Ed.) Current robotic sensing is mainly visual, which is useful up until the point of contact. To understand how an object is being gripped, tactile feedback is needed. Human grasp is gentle yet firm, with integrated tactile touch feedback. Ras Labs makes Synthetic Muscle™, which is a class of electroactive polymer (EAP) based materials and actuators that sense pressure from gentle touch to high impact, controllably contract and expand at low voltage (battery levels), and attenuate force. The development of this technology towards sensing has provided for fingertip-like sensors that were able to detect very light pressures down to 0.01 N and even 0.005 N, with a wide pressure range to 25 N and more and with high linearity. By using these soft yet robust Tactile Fingertip™ sensors, immediate feedback was generated at the first point of contact. Because these elastomeric pads provided a soft compliant interface, the first point of contact did not apply excessive force, allowing for gentle object handling and control of the force applied to the object. The Tactile Fingertip could also detect a change in pressure location on its surface, i.e., directional glide provided real time feedback, making it possible to detect and prevent slippage by then adjusting the grip strength. Machine learning (ML) and artificial intelligence (AI) were integrated into these sensors for object identification along with the determination of good grip (position, grip force, no slip, no wobble) for pick-and-place and other applications. Synthetic Muscle™ is also being retrofitted as actuators into a human hand-like biomimetic gripper. The combination of EAP shape-morphing and sensing promises the potential for robotic grippers with human hand-like control and tactile sensing.
This is expected to advance robotics, whether it is for agriculture, medical surgery, therapeutic or personal care, or in extreme environments where humans cannot enter, including with contagions that have no cure, as well as for collaborative robotics to allow humans and robots to intuitively work safely and effectively together.
Madden, John D. ; Anderson, Iain A. ; Shea, Herbert R. (Ed.) Ras Labs makes Synthetic Muscle™, which is a class of electroactive polymer (EAP) based materials and actuators that sense pressure (gentle touch to high impact), controllably contract and expand at low voltage (1.5 V to 50 V, including use of batteries), and attenuate force. We are in the robotics era, but robots do have their challenges. Currently, robotic sensing is mainly visual, which is useful up until the point of contact. To understand how an object is being gripped, tactile feedback is needed. For handling fragile objects, if the grip is too tight, breakage occurs, and if the grip is too loose, the object will slip out of the grasp, also leading to breakage. Rigid robotic grippers using a visual feedback loop can struggle to determine the exact point and quality of contact. Robotic grippers can also get a stuttering effect in the visual feedback loop. By using soft Synthetic Muscle™ based EAP pads as the sensors, immediate feedback was generated at the first point of contact. Because these pads provided a soft, compliant interface, the first point of contact did not apply excessive force, allowing the force applied to the object to be controlled. The EAP sensor could also detect a change in pressure location on its surface, making it possible to detect and prevent slippage by then adjusting the grip strength. In other words, directional glide provided feedback for the presence of possible slippage to then be able to control a slightly tighter grip, without stutter, due to both the feedback and the soft gentleness of the fingertip-like EAP pads themselves. The soft nature of the EAP fingertip pad also naturally held the gripped object, improving the gripping quality over rigid grippers without an increase in applied force.
Analogous to finger-like tactile touch, the EAPs with appropriate coatings and electronics were positioned as pressure sensors in the fingertip or end effector regions of robotic grippers. This development of using Synthetic Muscle™ based EAPs as soft sensors provided for sensors that feel like the pads of human fingertips. Basic pressure position and magnitude tests have been successful, with pressure sensitivity down to 0.05 N. Most automation and robots are very strong, very fast, and usually need to be partitioned away from humans for safety reasons. For many repetitive tasks that humans do with delicate or fragile objects, it would be beneficial to use robotics; whether it is for agriculture, medical surgery, therapeutic or personal care, or in extreme environments where humans cannot enter, including with contagions that have no cure. Synthetic Muscle™ was also retrofitted as actuator systems into off-the-shelf robotic grippers and is being considered in novel biomimetic gripper designs, operating at low voltages (less than 50 V). This offers biomimetic movement by contracting like human muscles, but also exceeds natural biological capabilities by expanding under reversed electric polarity. Human grasp is gentle yet firm, with tactile touch feedback. In conjunction with shape-morphing abilities, these EAPs also are being explored to intrinsically sense pressure due to the correlation between mechanical force applied to the EAP and its electronic signature. The robotic field is experiencing phenomenal growth in this fourth phase of the industrial revolution, the robotics era. The combination of Ras Labs’ EAP shape-morphing and sensing features promises the potential for robotic grippers with human hand-like control and tactile sensing. This work is expected to advance both robotics and prosthetics, particularly for collaborative robotics to allow humans and robots to intuitively work safely and effectively together.
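The idea that a change in pressure location can signal possible slippage could be sketched roughly as follows. This is only an illustration under strong assumptions: the pad is modeled as a small grid of pressure readings, and the names `center_of_pressure`, `glide_detected`, and the `min_shift` threshold are hypothetical, not part of the Ras Labs sensors or their electronics.

```python
from typing import List, Optional, Tuple

Grid = List[List[float]]  # assumed layout: rows x columns of pressure readings


def center_of_pressure(grid: Grid) -> Optional[Tuple[float, float]]:
    """Pressure-weighted mean (row, column); None when nothing is pressed."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        return None
    row_mean = sum(i * v for i, row in enumerate(grid) for v in row) / total
    col_mean = sum(j * v for row in grid for j, v in enumerate(row)) / total
    return (row_mean, col_mean)


def glide_detected(prev: Grid, curr: Grid, min_shift: float = 0.5) -> bool:
    """Flag possible slippage when the pressure location moves by more than
    min_shift cells between two consecutive readings."""
    p, c = center_of_pressure(prev), center_of_pressure(curr)
    if p is None or c is None:
        return False
    shift = ((c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2) ** 0.5
    return shift > min_shift
```

A controller built on such a signal might respond by slightly increasing grip force whenever `glide_detected` returns `True`.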
Abstract: Identifying critical decisions is one of the most challenging decision-making problems in real-world applications. In this work, we propose a novel Reinforcement Learning (RL) based Long-Short Term Rewards (LSTR) framework for critical decision identification. RL is a machine learning area concerned with inducing effective decision-making policies which, when followed, yield the maximum cumulative "reward." Many RL algorithms find the optimal policy via estimating the optimal Q-values, which specify the maximum cumulative reward the agent can receive. In our LSTR framework, the "long term" rewards are defined as "Q-values" and the "short term" rewards are determined by the "reward function." Experiments on a synthetic GridWorld game and real-world Intelligent Tutoring System datasets show that the proposed LSTR framework indeed identifies the critical decisions in the sequences. Furthermore, our results show that carrying out the critical decisions alone is as effective as a fully-executed policy.
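To make the role of Q-values as "long term" rewards more concrete, the sketch below computes optimal Q-values for a tiny deterministic decision problem and flags states where the choice of action changes the long-term outcome substantially. The three-state problem, the Q-value gap rule, and the `threshold` value are assumptions made for illustration; the abstract does not specify how LSTR combines long-term and short-term rewards, and the short-term part is not modeled here.

```python
# Hypothetical toy problem: TRANSITIONS[state][action] = (next_state, reward).
# A next_state of None marks the end of the episode.
TRANSITIONS = {
    0: {0: (1, 0.0), 1: (1, 0.0)},        # both actions lead to state 1
    1: {0: (None, 0.0), 1: (2, 0.0)},     # action 0 ends the episode with nothing
    2: {0: (None, 1.0), 1: (None, 1.0)},  # both actions collect the final reward
}


def q_value_iteration(transitions, gamma=0.9, sweeps=50):
    """Estimate optimal Q-values with repeated Bellman optimality backups."""
    q = [[0.0 for _ in transitions[s]] for s in sorted(transitions)]
    for _ in range(sweeps):
        for s, actions in transitions.items():
            for a, (s_next, reward) in actions.items():
                future = 0.0 if s_next is None else max(q[s_next])
                q[s][a] = reward + gamma * future
    return q


def critical_states(q, threshold=0.5):
    """Assumed rule: a state is critical when its best and worst actions
    differ in Q-value by more than threshold."""
    return [s for s, values in enumerate(q) if max(values) - min(values) > threshold]
```

In this toy problem only state 1 is flagged, since it is the only state where one action forfeits the final reward while the other preserves it.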
This paper introduces a vision-based tactile sensor, FingerVision, and explores its usefulness in tactile behaviors. FingerVision consists of a transparent elastic skin marked with dots, and a camera that is easy to fabricate, low cost, and physically robust. Unlike other vision-based tactile sensors, the complete transparency of the FingerVision skin provides multimodal sensation. The modalities sensed by FingerVision include distributions of force and slip, and object information such as distance, location, pose, size, shape, and texture. The slip detection is very sensitive since it is obtained by computer vision directly applied to the output from the FingerVision camera. It provides high-resolution slip detection, which does not depend on the contact force, i.e., it can sense slip of a lightweight object that generates negligible contact force. The tactile behaviors explored in this paper include manipulations that utilize this feature. For example, we demonstrate that grasp adaptation with FingerVision can grasp origami, and other deformable and fragile objects such as vegetables, fruits, and raw eggs.