Title: Learning Generalizable Tool-use Skills through Trajectory Generation
Autonomous systems that efficiently utilize tools can assist humans in completing many common tasks such as cooking and cleaning. However, current systems fall short of human-level intelligence when adapting to novel tools. Prior affordance-based works often make strong assumptions about the environment and cannot scale to more complex, contact-rich tasks. In this work, we tackle this challenge and explore how agents can learn to use previously unseen tools to manipulate deformable objects. We propose to learn a generative model of tool-use trajectories, represented as sequences of tool point clouds, which generalizes to different tool shapes. Given any novel tool, we first generate a tool-use trajectory and then optimize the sequence of tool poses to align with the generated trajectory. We train a single model on four challenging deformable object manipulation tasks, using demonstration data from only one tool per task. The model generalizes to various novel tools, significantly outperforming baselines. We further test our trained policy in the real world with unseen tools, where it achieves performance comparable to that of humans.
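The second stage described in the abstract, fitting a sequence of tool poses to a generated point-cloud trajectory, can be illustrated with a minimal sketch. The abstract does not say how the optimization is carried out, so the code below swaps in a closed-form least-squares rigid fit as a stand-in. It makes two loud assumptions: the problem is reduced to 2D, and each generated point corresponds to a known point on the tool. All function names (`fit_rigid_2d`, `apply_pose`, `fit_pose_sequence`) are hypothetical and are not taken from the paper.

```python
import math


def fit_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target.

    Assumption: source[i] corresponds to target[i] (known correspondences).
    """
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n

    dot = 0.0    # sum of inner products of centered point pairs
    cross = 0.0  # sum of 2D cross products of centered point pairs
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy
        bx, by = qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    # The angle that maximizes the alignment of the centered point sets.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that moves the rotated source centroid onto the target centroid.
    translation = (tx - (c * sx - s * sy), ty - (s * sx + c * sy))
    return theta, translation


def apply_pose(points, theta, translation):
    """Rotate points by theta about the origin, then translate them."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + translation[0], s * x + c * y + translation[1])
            for x, y in points]


def fit_pose_sequence(tool, generated_trajectory):
    """Fit one rigid pose per frame of a generated point-cloud trajectory."""
    return [fit_rigid_2d(tool, frame) for frame in generated_trajectory]


# Hypothetical data: a three-point "tool" and a synthetic three-frame trajectory
# produced by known poses, standing in for the output of a generative model.
tool = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.5)]
generated = [apply_pose(tool, 0.1 * k, (0.2 * k, -0.1 * k)) for k in range(3)]
poses = fit_pose_sequence(tool, generated)
```

Because the synthetic frames are exact rigid transforms of the tool, the fit recovers the poses used to create them. A generated trajectory from a learned model would generally be noisy and lack correspondences, in which case an iterative optimizer with a correspondence-free alignment loss would be the more likely choice.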
Award ID(s):
2046491
PAR ID:
10573305
Publisher / Repository:
arxiv.org
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the problem of sequential robotic manipulation of deformable objects using tools. Previous works have shown that differentiable physics simulators provide gradients to the environment state and help trajectory optimization to converge orders of magnitude faster than model-free reinforcement learning algorithms for deformable object manipulation. However, such gradient-based trajectory optimization typically requires access to the full simulator states and can only solve short-horizon, single-skill tasks due to local optima. In this work, we propose a novel framework, named DiffSkill, that uses a differentiable physics simulator for skill abstraction to solve long-horizon deformable object manipulation tasks from sensory observations. In particular, we first obtain short-horizon skills using individual tools from a gradient-based optimizer, using the full state information in a differentiable simulator; we then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input. Finally, we plan over the skills by finding the intermediate goals and then solve long-horizon tasks. We show the advantages of our method in a new set of sequential deformable object manipulation tasks compared to previous reinforcement learning algorithms and compared to the trajectory optimizer. 
  2. The task of “relative placement” is to predict the placement of one object in relation to another, e.g. placing a mug onto a mug rack. Through explicit object-centric geometric reasoning, recent methods for relative placement have made tremendous progress towards data-efficient learning for robot manipulation while generalizing to unseen task variations. However, they have yet to represent deformable transformations, despite the ubiquity of non-rigid bodies in real-world settings. As a first step towards bridging this gap, we propose “cross-displacement” - an extension of the principles of relative placement to geometric relationships between deformable objects - and present a novel vision-based method to learn cross-displacement through dense diffusion. To this end, we demonstrate our method’s ability to generalize to unseen object instances, out-of-distribution scene configurations, and multimodal goals on multiple highly deformable tasks (both in simulation and in the real world) beyond the scope of prior works.
  3. This project introduces a framework to enable robots to recognize human hand signals, a reliable and feasible device-free means of communication in many noisy environments such as construction sites and airport ramps, to facilitate efficient human-robot collaboration. Various hand signal systems are accepted in many small groups for specific purposes, such as marshalling on airport ramps and construction site crane operations. Robots must be robust to unpredictable conditions, including various backgrounds and human appearances, an extreme challenge imposed by open environments. To address these challenges, we propose Instant Hand Signal Recognition (IHSR), a learning-based framework with embedded world knowledge of human gestures, for robots to learn novel hand signals from a few samples. It also offers robust zero-shot generalization to recognize learned signals in novel scenarios. Extensive experiments show that our IHSR can learn a novel hand signal from only 50 samples, which is 30+ times more efficient than the state-of-the-art method. It also demonstrates robust zero-shot generalization when a learned model is deployed in unseen environments to recognize hand signals from unseen human users.
  4. Robot-to-human handovers are common exercises in many robotics application domains. The requirements of handovers may vary across these different domains. In this paper, we first devised a taxonomy to organize the diverse and sometimes contradictory requirements. Among these, task-oriented handovers are not well-studied but important because the purpose of the handovers in human-robot collaboration (HRC) is not merely to pass an object from a robot to a human receiver, but to enable the receiver to use it in a subsequent tool-use task. A successful task-oriented handover should incorporate task-related information - orienting the tool such that the human can grasp it in a way that is suitable for the task. We identified multiple difficulty levels of task-oriented handovers, and implemented a system to generate handovers with novel tools on a physical robot. Unlike previous studies on task-oriented handovers, we trained the robot with tool-use demonstrations rather than handover demonstrations, since task-oriented handovers are dependent on the tool usages in the subsequent task. We demonstrated that our method can adapt to all difficulty levels of task-oriented handovers, including tasks that matched the typical usage of the tool, tasks that required an improvised or unusual usage of the tool, and tasks where the handover was adapted to the pose of a manipulandum. 
  5. Children’s automatic speech recognition (ASR) is difficult due in part to the data scarcity problem, especially for kindergarten-aged kids. When data are scarce, the model might overfit to the training data, and hence good starting points for training are essential. Recently, meta-learning was proposed to learn model initialization (MI) for ASR tasks of different languages. This method leads to good performance when the model is adapted to an unseen language. However, MI is vulnerable to overfitting on training tasks (learner overfitting). It is also unknown whether MI generalizes to other low-resource tasks. In this paper, we validate the effectiveness of MI in children’s ASR and attempt to alleviate the problem of learner overfitting. To achieve model-agnostic meta-learning (MAML), we regard children’s speech at each age as a different task. To address learner overfitting, we propose a task-level augmentation method that simulates new ages using frequency warping techniques. Detailed experiments are conducted to show the impact of task augmentation on each age for kindergarten-aged speech. As a result, our approach achieves a relative word error rate (WER) improvement of 51% over the baseline system with no augmentation or initialization.