skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Universal Planning Networks
A key challenge in complex visuomotor control is learning abstract representations that are ef- fective for specifying goals, planning, and gen- eralization. To this end, we introduce universal planning networks (UPN). UPNs embed differen- tiable planning within a goal-directed policy. This planning computation unrolls a forward model in a latent space and infers an optimal action plan through gradient descent trajectory optimization. The plan-by-gradient-descent process and its un- derlying representations are learned end-to-end to directly optimize a supervised imitation learning objective. We find that the representations learned are not only effective for goal-directed visual imi- tation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images. The learned representations can be leveraged to specify distance-based rewards to reach new target states for model-free reinforce- ment learning, resulting in substantially more ef- fective learning when solving new tasks described via image-based goals. We were able to achieve successful transfer of visuomotor planning strate- gies across robots with significantly different mor- phologies and actuation capabilities.  more » « less
Award ID(s):
1734633
PAR ID:
10063835
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
International Conference on Machine Learning (ICML)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Enabling robots to learn tasks and follow instructions as easily as humans is important for many real-world robot applications. Previous approaches have applied machine learning to teach the mapping from language to low dimensional symbolic representations constructed by hand, using demonstration trajectories paired with accompanying instructions. These symbolic methods lead to data efficient learning. Other methods map language directly to high-dimensional control behavior, which requires less design effort but is data-intensive. We propose to first learning symbolic abstractions from demonstration data and then mapping language to those learned abstractions. These symbolic abstractions can be learned with significantly less data than end-to-end approaches, and support partial behavior specification via natural language since they permit planning using traditional planners. During training, our approach requires only a small number of demonstration trajectories paired with natural language—without the use of a simulator—and results in a representation capable of planning to fulfill natural language instructions specifying a goal or partial plan. We apply our approach to two domains, including a mobile manipulator, where a small number of demonstrations enable the robot to follow navigation commands like “Take left at the end of the hallway,” in environments it has not encountered before. 
    more » « less
  2. Enabling robots to learn tasks and follow instructions as easily as humans is important for many real-world robot applications. Previous approaches have applied machine learning to teach the mapping from language to low dimensional symbolic representations constructe by hand, using demonstration trajectories paired with accompanying instructions. These symbolic methods lead to data efficient learning. Other methods map language directly to high-dimensional control behavior, which requires less design effort but is data-intensive. We propose to first learning symbolic abstractions from demonstration data and then mapping language to those learned abstractions. These symbolic abstractions can be learned with significantly less data than end-to-end approaches, and support partial behavior specification via natural language since they permit planning using traditional planners. During training, our approach requires only a small number of demonstration trajectories paired with natural language—without the use of a simulator—and results in a representation capable of planning to fulfill natural language instructions specifying a goal or partial plan. We apply our approach to two domains, including a mobile manipulator, where a small number of demonstrations enable the robot to follow navigation commands like “Take left at the end of the hallway,” in environments it has not encountered before. 
    more » « less
  3. null (Ed.)
    The TEB hierarchical planner for real-time navigation through unknown environments is highly effective at balancing collision avoidance with goal directed motion. Designed over several years and publications, it implements a multi-trajectory optimization based synthesis method for identifying topologically distinct trajectory candidates through navigable space. Unfortunately, the underlying factor graph approach to the optimization problem induces a mismatch between grid-based representations and the optimization graph, which leads to several time and optimization inefficiencies. This paper explores the impact of using egocentric, perception space representations for the local planning map. Doing so alleviates many of the identified issues related to TEB and leads to a new method called egoTEB. Timing experiments and Monte Carlo evaluations in benchmark worlds quantify the benefits of egoTEB for navigation through uncertain environments. 
    more » « less
  4. We consider the problem of sequential robotic manipulation of deformable objects using tools. Previous works have shown that differentiable physics simulators provide gradients to the environment state and help trajectory optimization to converge orders of magnitude faster than model-free reinforcement learning algorithms for deformable object manipulation. However, such gradient-based trajectory optimization typically requires access to the full simulator states and can only solve short-horizon, single-skill tasks due to local optima. In this work, we propose a novel framework, named DiffSkill, that uses a differentiable physics simulator for skill abstraction to solve long-horizon deformable object manipulation tasks from sensory observations. In particular, we first obtain short-horizon skills using individual tools from a gradient-based optimizer, using the full state information in a differentiable simulator; we then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input. Finally, we plan over the skills by finding the intermediate goals and then solve long-horizon tasks. We show the advantages of our method in a new set of sequential deformable object manipulation tasks compared to previous reinforcement learning algorithms and compared to the trajectory optimizer. 
    more » « less
  5. This thesis summary presents research focused on incorporating high-level abstract behavioral requirements, called 'conceptual constraints', into the modeling processes of robot Learning from Demonstration (LfD) techniques. This idea is realized via an LfD algorithm called Concept Constrained Learning from Demonstration. This algorithm encodes motion planning constraints as temporally associated logical formulae of Boolean operators that enforce high-level constraints over portions of the robot's motion plan during learned skill execution. This results in more easily trained, more robust, and safer learned skills. Current work focuses on automating constraint discovery, introducing conceptual constraints into human-aware motion planning algorithms, and expanding upon trajectory alignment techniques for LfD. Future work will focus on how concept constrained algorithms and models are best incorporated into effective interfaces for end-users. 
    more » « less