Title: Skill Generalization with Verbs
It is imperative that robots understand natural language commands issued by humans. Such commands typically contain verbs that signify what action should be performed on a given object and that are applicable to many objects. We propose a method for generalizing manipulation skills to novel objects using verbs. Our method learns a probabilistic classifier that determines whether a given object trajectory can be described by a specific verb. We show that this classifier generalizes to novel object categories with an average accuracy of 76.69% across 13 object categories and 14 verbs. We then perform policy search over the object kinematics to find an object trajectory that maximizes classifier prediction for a given verb. Our method allows a robot to generate a trajectory for a novel object based on a verb, which can then be used as input to a motion planner. We show that our model can generate trajectories that are usable for executing five verb commands applied to novel instances of two different object categories on a real robot.
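The two-stage idea in the abstract, a learned probabilistic verb classifier scoring candidate object trajectories and a derivative-free policy search that maximizes that score, can be pictured with the minimal sketch below. The classifier stub, the cross-entropy-method optimizer, and all parameter names are illustrative assumptions, not the authors' implementation:

# Minimal sketch: (1) a stand-in probabilistic verb classifier over object
# trajectories, (2) policy search over trajectory parameters to maximize the
# classifier's score for a commanded verb. All components are placeholders.
import numpy as np

rng = np.random.default_rng(0)

def verb_classifier(trajectory, verb):
    """Stand-in for the learned classifier.

    trajectory: (T, 6) array of object poses (x, y, z, roll, pitch, yaw).
    Returns P(verb | trajectory). Here "lift" is scored by net upward motion,
    purely as a placeholder for a trained model.
    """
    if verb == "lift":
        dz = trajectory[-1, 2] - trajectory[0, 2]
        return 1.0 / (1.0 + np.exp(-10.0 * (dz - 0.1)))  # logistic on height gain
    return 0.0

def policy_search(verb, start_pose, horizon=20, iters=30, pop=64, elite=8):
    """Cross-entropy method over per-step pose deltas; returns best trajectory."""
    dim = horizon * 6
    mean, std = np.zeros(dim), 0.05 * np.ones(dim)
    for _ in range(iters):
        deltas = rng.normal(mean, std, size=(pop, dim))
        scores = []
        for d in deltas:
            traj = start_pose + np.cumsum(d.reshape(horizon, 6), axis=0)
            scores.append(verb_classifier(traj, verb))
        elite_idx = np.argsort(scores)[-elite:]
        mean = deltas[elite_idx].mean(axis=0)
        std = deltas[elite_idx].std(axis=0) + 1e-3
    best = start_pose + np.cumsum(mean.reshape(horizon, 6), axis=0)
    return best  # candidate object trajectory, to be handed to a motion planner

if __name__ == "__main__":
    traj = policy_search("lift", start_pose=np.zeros(6))
    print("classifier score:", verb_classifier(traj, "lift"))

The returned trajectory describes object motion only; as in the abstract, it would then be passed to a separate motion planner for execution on the robot.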
Award ID(s):
1955361
PAR ID:
10467322
Author(s) / Creator(s):
Publisher / Repository:
Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Robot manipulation and grasping mechanisms have received considerable attention in the recent past, leading to the development of a wide range of industrial applications. This paper proposes the development of an autonomous robotic grasping system for an object sorting application. RGB-D data is used by the robot to perform object detection, pose estimation, trajectory generation and object sorting tasks. The proposed approach can also handle grasping of specific objects chosen by users. Trained convolutional neural networks are used to perform object detection and determine the corresponding point cloud cluster of the object to be grasped. From the selected point cloud data, a grasp generator algorithm outputs potential grasps. A grasp filter then scores these potential grasps, and the highest-scored grasp is chosen for execution on a real robot. A motion planner generates collision-free trajectories to execute the chosen grasp. Experiments on an AUBO robotic manipulator show the potential of the proposed approach in the context of autonomous object sorting with robust and fast sorting performance.
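The pipeline described in this abstract (detection, point cloud segmentation, grasp generation, grasp scoring, motion planning) can be sketched schematically as below. The function bodies are placeholders; the trained CNNs, grasp generator, and AUBO planner from the paper are not reproduced here:

# Illustrative sketch of an RGB-D sorting pipeline: detector -> point-cloud
# cluster -> grasp candidates -> scoring -> planning. All stages are stubs.
from dataclasses import dataclass
import numpy as np

@dataclass
class Grasp:
    position: np.ndarray   # 3D grasp point
    approach: np.ndarray   # approach direction
    score: float = 0.0

def detect_object_cluster(rgbd, target_label):
    """Placeholder for CNN detection plus point-cloud segmentation."""
    return np.random.rand(500, 3)  # dummy point cloud cluster

def generate_grasps(cloud, n=32):
    """Placeholder grasp generator: sample candidate grasps on the cluster."""
    idx = np.random.choice(len(cloud), n)
    return [Grasp(position=cloud[i], approach=np.array([0.0, 0.0, -1.0])) for i in idx]

def score_grasps(grasps, cloud):
    """Placeholder grasp filter: prefer grasps near the cluster centroid."""
    centroid = cloud.mean(axis=0)
    for g in grasps:
        g.score = -np.linalg.norm(g.position - centroid)
    return sorted(grasps, key=lambda g: g.score, reverse=True)

def plan_and_execute(best_grasp):
    """Stand-in for collision-free motion planning and execution."""
    print("executing grasp at", np.round(best_grasp.position, 3))

if __name__ == "__main__":
    cloud = detect_object_cluster(rgbd=None, target_label="bottle")
    best = score_grasps(generate_grasps(cloud), cloud)[0]
    plan_and_execute(best)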
  2.
    As autonomous robots interact with and navigate around real-world environments such as homes, it is useful to reliably identify and manipulate articulated objects, such as doors and cabinets. Many prior works on identifying object articulation require manipulation of the object, either by the robot or a human. While recent works have addressed predicting articulation types from visual observations alone, they often assume prior knowledge of category-level kinematic motion models or a sequence of observations in which the articulated parts move according to their kinematic constraints. In this work, we propose FormNet, a neural network that identifies the articulation mechanisms between pairs of object parts from a single frame of an RGB-D image and segmentation masks. The network is trained on 100k synthetic images of 149 articulated objects from 6 categories. Synthetic images are rendered via a photorealistic simulator with domain randomization. Our proposed model predicts motion residual flows of object parts, and these flows are used to determine the articulation type and parameters. The network achieves an articulation type classification accuracy of 82.5% on novel object instances in trained categories. Experiments also show how this method enables generalization to novel categories and can be applied to real-world images without fine-tuning.
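A rough illustration of the kind of post-processing step this abstract describes, going from predicted per-point motion flows of an object part to an articulation type, is sketched below. The flow-direction-variance heuristic is an assumption for illustration only, not FormNet's actual estimator:

# Given predicted per-point flows for a part, decide between a prismatic
# (translational) and revolute (rotational) articulation. Heuristic sketch.
import numpy as np

def classify_articulation(points, flows, direction_var_thresh=0.05):
    """points, flows: (N, 3) arrays. Returns (type, axis estimate)."""
    dirs = flows / (np.linalg.norm(flows, axis=1, keepdims=True) + 1e-9)
    # Translational motion: all flow directions are nearly identical.
    if dirs.var(axis=0).sum() < direction_var_thresh:
        axis = dirs.mean(axis=0)
        return "prismatic", axis / np.linalg.norm(axis)
    # Rotational motion: estimate the rotation axis as the direction least
    # aligned with the observed flows (smallest right singular vector).
    _, _, vt = np.linalg.svd(dirs, full_matrices=False)
    return "revolute", vt[-1]

if __name__ == "__main__":
    pts = np.random.rand(200, 3)
    door_flows = np.cross(np.array([0.0, 0.0, 1.0]), pts)   # rotation about z
    drawer_flows = np.tile([1.0, 0.0, 0.0], (200, 1))       # pure translation
    print(classify_articulation(pts, door_flows)[0])    # -> revolute
    print(classify_articulation(pts, drawer_flows)[0])  # -> prismatic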
  3.
    Abstract Noun incorporation is commonly thought to avoid the weak compositionality of compounds because it involves conjunction of an argument noun with the incorporating verb. However, it is weakly compositional in two ways. First, the noun’s entity argument needs to be bound or saturated, but previous accounts fail to adequately ensure that it is. Second, non-arguments are often incorporated in many languages, and their thematic role is available for contextual selection. We show that these two weaknesses are actually linked. We focus on the Kiowa language, which generally bars objects from incorporation but allows non-arguments. We show that a mediating relation is required to semantically link the noun to the verb. Absent a relation, the noun’s entity argument is not saturated, and the entire expression is uninterpretable. The mediating relation for non-objects also assigns it a thematic role instead of a postposition. Speakers can choose this role freely, subject to independent constraints from the pragmatics, syntax, and semantics. Objects in Kiowa are in fact allowed to incorporate in certain environments, but we show that these all independently involve a mediating relation. The mediating relation for objects quantifies over the noun and links the noun+verb construction to the rest of the clause. The head that introduces this relation re-categorizes the verb in the syntactic derivation. Essentially, we demonstrate two distinct mechanisms for noun incorporation. Having derived the distribution of Kiowa, we apply the same relations to derive constraints on English complex verbs and synthetic compounds, which exhibit most of the same constraints as Kiowa noun incorporation. We also look at languages with routine object incorporation, and show how the transitivity of the verb depends on whether the v° head introducing the external argument assigns case to the re-categorized verb. 
  4. A robot operating in a household environment will see a wide range of unique and unfamiliar objects. While a system could be trained on many of these, it is infeasible to predict all the objects a robot will see. In this paper, we present a method to generalize object manipulation skills acquired from a limited number of demonstrations to novel objects from unseen shape categories. Our approach, Local Neural Descriptor Fields (LNDF), utilizes neural descriptors defined on the local geometry of the object to effectively transfer manipulation demonstrations to novel objects at test time. In doing so, we leverage the local geometry shared between objects to produce a more general manipulation framework. We illustrate the efficacy of our approach in manipulating novel objects in novel poses, both in simulation and in the real world.
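The transfer step this abstract describes, matching descriptors of local geometry around a demonstrated contact point against descriptors on a novel object, can be pictured with the sketch below. The hand-crafted local descriptor stands in for the learned neural descriptor field and is purely an assumption:

# Schematic descriptor-based transfer: find the point on a novel object whose
# local-geometry descriptor best matches the demonstrated contact point.
import numpy as np

def local_descriptor(cloud, query, k=20):
    """Describe local geometry around `query` via its k nearest neighbors."""
    d = np.linalg.norm(cloud - query, axis=1)
    nbrs = cloud[np.argsort(d)[:k]] - query
    cov = np.cov(nbrs.T)
    evals = np.sort(np.linalg.eigvalsh(cov))       # local shape spectrum
    return np.concatenate([evals, nbrs.mean(axis=0)])

def transfer_contact_point(demo_cloud, demo_contact, novel_cloud):
    """Return the point on the novel object whose local descriptor best
    matches the descriptor at the demonstrated contact point."""
    target = local_descriptor(demo_cloud, demo_contact)
    descs = np.stack([local_descriptor(novel_cloud, p) for p in novel_cloud])
    best = np.argmin(np.linalg.norm(descs - target, axis=1))
    return novel_cloud[best]

if __name__ == "__main__":
    demo = np.random.rand(300, 3)
    novel = demo * 1.2 + 0.05 * np.random.rand(300, 3)   # scaled, noisy variant
    contact = demo[0]
    print("transferred contact point:", transfer_contact_point(demo, contact, novel))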