Title: Energy-based Models are Zero-Shot Planners for Compositional Scene Rearrangement
Language is compositional; an instruction can express multiple relation constraints to hold among objects in a scene that a robot is tasked to rearrange. Our focus in this work is an instructable scene-rearranging framework that generalizes to longer instructions and to spatial concept compositions never seen at training time. We propose to represent language-instructed spatial concepts with energy functions over relative object arrangements. A language parser maps instructions to corresponding energy functions, and an open-vocabulary visual-language model grounds their arguments to relevant objects in the scene. We generate goal scene configurations by gradient descent on the sum of energy functions, one per language predicate in the instruction. Local vision-based policies then relocate objects to the inferred goal locations. We test our model on established instruction-guided manipulation benchmarks, as well as benchmarks of compositional instructions we introduce. We show our model can execute highly compositional instructions zero-shot in simulation and in the real world. It outperforms language-to-action reactive policies and Large Language Model planners by a large margin, especially for long instructions that involve compositions of multiple spatial concepts. Simulation and real-world robot execution videos, as well as our code and datasets, are publicly available on our website: https://ebmplanner.github.io.
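The core planning step described in the abstract, gradient descent on a sum of per-predicate energy functions over object poses, can be illustrated with a small sketch. Everything below is an illustrative assumption rather than the paper's implementation: hand-written quadratic penalties stand in for the energy-based models, and the object names, margins, and step size are invented.

```python
# A minimal sketch of goal-configuration inference by gradient descent on a
# sum of per-predicate energy functions. The energies below are hand-written
# quadratic penalties standing in for the paper's energy-based models, and all
# object names, margins, and step sizes are illustrative assumptions.
import numpy as np

def energy_left_of(pos, a, b, margin=0.10):
    """Penalize `a` not being at least `margin` to the left of `b` (x-axis)."""
    return max(0.0, pos[a][0] - pos[b][0] + margin) ** 2

def energy_near(pos, a, b, dist=0.15):
    """Penalize the `a`-`b` distance deviating from a target distance."""
    return (np.linalg.norm(pos[a] - pos[b]) - dist) ** 2

def total_energy(pos, constraints):
    """Sum one energy term per language predicate in the instruction."""
    return sum(fn(pos, *args) for fn, *args in constraints)

def numeric_grad(pos, constraints, eps=1e-4):
    """Finite-difference gradient of the summed energy w.r.t. every 2-D position."""
    base = total_energy(pos, constraints)
    grad = {name: np.zeros(2) for name in pos}
    for name in pos:
        for d in range(2):
            bumped = {n: p.copy() for n, p in pos.items()}
            bumped[name][d] += eps
            grad[name][d] = (total_energy(bumped, constraints) - base) / eps
    return grad

# "Put the mug to the left of the plate and near the bowl", already parsed into
# one energy term per predicate, with object groundings assumed to be given.
constraints = [(energy_left_of, "mug", "plate"), (energy_near, "mug", "bowl")]
pos = {"mug":   np.array([0.30, 0.00]),
       "plate": np.array([0.10, 0.00]),
       "bowl":  np.array([0.00, 0.40])}

for _ in range(300):                      # plain gradient descent on the sum
    g = numeric_grad(pos, constraints)
    for name in pos:
        pos[name] = pos[name] - 0.1 * g[name]

print({name: np.round(p, 3) for name, p in pos.items()})  # inferred goal layout
```

The useful property of the composition is that adding another predicate to the instruction only adds another term to the sum being minimized, which is what lets the approach handle concept compositions never seen during training.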
Award ID(s):
1849287
NSF-PAR ID:
10496019
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
Robotics: Science and Systems Foundation
Date Published:
Journal Name:
Robotics: Science and Systems 2023
ISSN:
2330-7668
Format(s):
Medium: X
Location:
Daegu, Republic of Korea
Sponsoring Org:
National Science Foundation
More Like this
  1. Most existing benchmarks for grounding language in interactive environments either lack realistic linguistic elements, or prove difficult to scale up due to substantial human involvement in the collection of data or feedback signals. We develop WebShop – a simulated e-commerce website environment with 1.18 million real-world products and 12,087 crowd-sourced text instructions. In this environment, an agent needs to navigate multiple types of webpages and issue diverse actions to find, customize, and purchase a product given an instruction. WebShop provides several challenges including understanding compositional instructions, query (re-)formulation, dealing with noisy text in webpages, and performing strategic exploration. We collect over 1,600 human trajectories to first validate the benchmark, then train and evaluate a diverse range of agents using reinforcement learning, imitation learning, and pre-trained image and language models. Our best model achieves a task success rate of 29%, which significantly outperforms rule heuristics but is far lower than expert human performance (59%). We also analyze agent and human trajectories and ablate various model components to provide insights for developing future agents with stronger language understanding and decision making abilities. Finally, we show our agent trained on WebShop exhibits non-trivial sim-to-real transfer when evaluated on amazon.com and ebay.com, indicating the potential value of our benchmark for developing practical web agents that can operate in the wild. 
  2. Tan, Jie; Toussaint, Marc (Eds.)
    With the advent of large language models and large-scale robotic datasets, there has been tremendous progress in high-level decision-making for object manipulation [1, 2, 3, 4]. These generic models are able to interpret complex tasks using language commands, but they often have difficulties generalizing to out-of-distribution objects due to the limitations of their low-level action primitives. In contrast, existing task-specific models [5, 6] excel in low-level manipulation of unknown objects, but only work for a single type of action. To bridge this gap, we present M2T2, a single model that supplies different types of low-level actions that work robustly on arbitrary objects in cluttered scenes. M2T2 is a transformer model which reasons about contact points and predicts valid gripper poses for different action modes given a raw point cloud of the scene. Trained on a large-scale synthetic dataset with 128K scenes, M2T2 achieves zero-shot sim2real transfer on the real robot, outperforming the baseline system with state-of-the-art task-specific models by about 19% in overall performance and 37.5% in challenging scenes where the object needs to be re-oriented for collision-free placement. M2T2 also achieves state-of-the-art results on a subset of language-conditioned tasks in RLBench [7]. Videos of robot experiments on unseen objects in both the real world and simulation are available on our project website https://m2-t2.github.io.
  3. The goal of this article is to enable robots to perform robust task execution following human instructions in partially observable environments. A robot’s ability to interpret and execute commands is fundamentally tied to its semantic world knowledge. Commonly, robots use exteroceptive sensors, such as cameras or LiDAR, to detect entities in the workspace and infer their visual properties and spatial relationships. However, semantic world properties are often visually imperceptible. We posit the use of non-exteroceptive modalities including physical proprioception, factual descriptions, and domain knowledge as mechanisms for inferring semantic properties of objects. We introduce a probabilistic model that fuses linguistic knowledge with visual and haptic observations into a cumulative belief over latent world attributes to infer the meaning of instructions and execute the instructed tasks in a manner robust to erroneous, noisy, or contradictory evidence. In addition, we provide a method that allows the robot to communicate knowledge dissonance back to the human as a means of correcting errors in the operator’s world model. Finally, we propose an efficient framework that anticipates possible linguistic interactions and infers the associated groundings for the current world state, thereby bootstrapping both language understanding and generation. We present experiments on manipulators for tasks that require inference over partially observed semantic properties, and evaluate our framework’s ability to exploit expressed information and knowledge bases to facilitate convergence, and generate statements to correct declared facts that were observed to be inconsistent with the robot’s estimate of object properties. (A toy sketch of this style of multimodal belief fusion appears after this list.)
  4. Language understanding is essential for the navigation agent to follow instructions. We observe two kinds of issues in the instructions that can make the navigation task challenging: 1. The mentioned landmarks are not recognizable by the navigation agent due to the different vision abilities of the instructor and the modeled agent. 2. The mentioned landmarks are applicable to multiple targets, and thus are not distinctive enough for selecting the target among the candidate viewpoints. To deal with these issues, we design a translator module for the navigation agent to convert the original instructions into easy-to-follow sub-instruction representations at each step. The translator needs to focus on the recognizable and distinctive landmarks based on the agent’s visual abilities and the observed visual environment. To achieve this goal, we create a new synthetic sub-instruction dataset and design specific tasks to train the translator and the navigation agent. We evaluate our approach on the Room2Room (R2R), Room4Room (R4R), and Room2Room Last (R2R-Last) datasets and achieve state-of-the-art results on multiple benchmarks.
  5. The utility of collaborative manipulators for shared tasks is highly dependent on the speed and accuracy of communication between the human and the robot. The run-time of recently developed probabilistic inference models for situated symbol grounding of natural language instructions depends on the complexity of the representation of the environment in which they reason. As we move towards more complex bi-directional interactions, tasks, and environments, we need intelligent perception models that can selectively infer precise pose, semantics, and affordances of the objects when inferring exhaustively detailed world models is inefficient and prohibits real-time interaction with these robots. In this paper we propose a model of language and perception for the problem of adapting the configuration of the robot perception pipeline for tasks where constructing exhaustively detailed models of the environment is inefficient and inconsequential for symbol grounding. We present experimental results from a synthetic corpus of natural language instructions for robot manipulation in example environments. The results demonstrate that by adapting perception we get significant gains in terms of run-time for perception and situated symbol grounding of the language instructions without a loss in the accuracy of the latter.
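Record 3 above describes fusing linguistic, visual, and haptic observations into a cumulative belief over latent, visually imperceptible object attributes. The toy sketch below shows one standard way such fusion can be framed, as a Bayesian update over a discrete attribute; the attribute, the likelihood tables, and the observation names are invented for illustration and are not taken from that work.

```python
# A toy cumulative-belief update over a latent attribute that cameras alone
# cannot resolve (here, whether a jar is full). The likelihood tables and
# observation names are invented assumptions, not values from the cited work.
ATTRIBUTE_VALUES = ("full", "empty")

# P(observation | attribute value) for each evidence source.
LIKELIHOODS = {
    "language": {"operator says 'the jar is full'": {"full": 0.90, "empty": 0.20}},
    "vision":   {"lid is closed":                   {"full": 0.60, "empty": 0.50}},
    "haptics":  {"lift feels heavy":                {"full": 0.80, "empty": 0.10}},
}

def update(belief, modality, observation):
    """Multiply the prior belief by the observation likelihood and renormalize."""
    table = LIKELIHOODS[modality][observation]
    posterior = {v: belief[v] * table[v] for v in ATTRIBUTE_VALUES}
    z = sum(posterior.values())
    return {v: p / z for v, p in posterior.items()}

belief = {"full": 0.5, "empty": 0.5}   # uninformative prior over the attribute
for modality, obs in [("language", "operator says 'the jar is full'"),
                      ("vision", "lid is closed"),
                      ("haptics", "lift feels heavy")]:
    belief = update(belief, modality, obs)
    print(f"after {modality}:", {v: round(p, 3) for v, p in belief.items()})
```

Because each modality enters only through a likelihood, noisy or contradictory observations shift probability mass rather than overwriting the estimate, which is one way to obtain the robustness to erroneous evidence that the abstract emphasizes.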