Guiding Multi-Step Rearrangement Tasks with Natural Language Instructions

Stengel-Eskin, Elias; Hundt, Andrew; He, Zhuohong; Murali, Aditya; Gopalan, Nakul; Gombolay, Matthew; Hager, Gregory

Citation Details

Enabling human operators to interact with robotic agents using natural language would allow non-experts to intuitively instruct these agents. Towards this goal, we propose a novel Transformer-based model which enables a user to guide a robot arm through a 3D multi-step manipulation task with natural language commands. Our system maps images and commands to masks over grasp or place locations, grounding the language directly in perceptual space. In a suite of block rearrangement tasks, we show that these masks can be combined with an existing manipulation framework without re-training, greatly improving learning efficiency. Our masking model is several orders of magnitude more sample efficient than typical Transformer models, operating with hundreds, not millions, of examples. Our modular design allows us to leverage supervised and reinforcement learning, providing an easy interface for experimentation with different architectures. Our model completes block manipulation tasks with synthetic commands more often than a UNet-based baseline, and learns to localize actions correctly while creating a mapping of symbols to perceptual input that supports compositional reasoning. We provide a valuable resource for 3D manipulation instruction following research by porting an existing 3D block dataset with crowdsourced language to a simulated environment. Our method’s absolute improvement in identifying the correct block on the ported dataset demonstrates its ability to handle syntactic and lexical variation.
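The abstract's idea of combining a language-derived mask with an existing manipulation framework can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `select_action`, the `threshold` parameter, the 2D array layout, and the fallback behavior are all hypothetical. It assumes the manipulation framework produces a per-pixel score map and the language model produces a per-pixel mask of the same shape.

```python
import numpy as np


def select_action(q_values, mask, threshold=0.5):
    """Return the (row, col) of the best-scoring pixel inside the mask.

    q_values: 2D per-pixel scores from a manipulation policy (assumed).
    mask:     2D per-pixel values in [0, 1] from a language model (assumed).
    """
    q_values = np.asarray(q_values, dtype=float)
    mask = np.asarray(mask, dtype=float)
    if q_values.shape != mask.shape:
        raise ValueError("q_values and mask must have the same shape")

    # Pixels the language mask allows (hypothetical thresholding rule).
    allowed = mask >= threshold
    if not allowed.any():
        # Assumed fallback: if the mask rules out everything, ignore it.
        allowed = np.ones_like(allowed, dtype=bool)

    # Disallowed pixels can never win the argmax.
    masked = np.where(allowed, q_values, -np.inf)
    row, col = np.unravel_index(np.argmax(masked), masked.shape)
    return int(row), int(col)


# The policy alone prefers pixel (0, 0); the mask redirects it to (1, 1).
scores = [[0.9, 0.1], [0.2, 0.3]]
language_mask = [[0.0, 0.0], [0.0, 1.0]]
print(select_action(scores, language_mask))
```

Because the mask only filters the policy's existing scores, the policy itself is left unchanged in this sketch, which is one way to read the abstract's claim that the masks can be used without re-training.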

Award ID(s): 1763705

NSF-PAR ID: 10314691

Author(s) / Creator(s): Stengel-Eskin, Elias; Hundt, Andrew; He, Zhuohong; Murali, Aditya; Gopalan, Nakul; Gombolay, Matthew; Hager, Gregory

Editor(s): Faust, Aleksandra; Hsu, David; Neumann, Gerhard

Date Published: 2021-10-01

Journal Name: Proceedings of Machine Learning Research

Volume: 164

ISSN: 2640-3498

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
