Efficient planning in continuous state and action spaces is
fundamentally hard, even when the transition model is deterministic
and known. One way to alleviate this challenge is to perform bilevel
planning with abstractions, where a high-level search for abstract
plans is used to guide planning in the original transition
space. Previous work has shown that when state abstractions in the
form of symbolic predicates are hand-designed, operators and samplers
for bilevel planning can be learned from demonstrations. In this work,
we propose an algorithm for learning predicates from demonstrations,
eliminating the need for manually specified state abstractions. Our
key idea is to learn predicates by optimizing a surrogate objective
that is tractable but faithful to our real efficient-planning
objective. We use this surrogate objective in a hill-climbing search
over predicate sets drawn from a grammar. Experimentally, we show
across four robotic planning environments that planning with our
learned abstractions quickly solves held-out tasks, outperforming
six baselines.
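To make the search procedure concrete, here is a minimal sketch of hill climbing over predicate sets drawn from a grammar, scored by a surrogate objective. All names here (enumerate_candidates, surrogate_score, and the data they operate on) are hypothetical stand-ins for illustration, not the authors' implementation.

```python
# A minimal sketch of hill-climbing predicate invention, assuming a
# bounded candidate enumerator and a surrogate scoring function.

def hill_climb(initial_predicates, demonstrations, enumerate_candidates,
               surrogate_score):
    """Greedily add the candidate predicate that improves the surrogate
    objective; stop when no candidate helps."""
    current = set(initial_predicates)
    best = surrogate_score(current, demonstrations)
    improved = True
    while improved:
        improved = False
        for candidate in enumerate_candidates(current):
            trial = current | {candidate}
            score = surrogate_score(trial, demonstrations)
            if score > best:  # surrogate is a proxy for planning efficiency
                current, best, improved = trial, score, True
                break  # first-improvement hill climbing
    return current
```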
Learning Efficient Abstract Planning Models that Choose What to Predict
An effective approach to solving long-horizon tasks in robotics domains with continuous state and action spaces is bilevel planning, wherein a high-level search over an abstraction of an environment is used to guide low-level decision-making. Recent work has shown how to enable such bilevel planning by learning abstract models in the form of symbolic operators and neural samplers. In this work, we show that existing symbolic operator learning approaches fall short in many robotics domains where a robot's actions tend to cause a large number of irrelevant changes in the abstract state. This is primarily because they attempt to learn operators that exactly predict all observed changes in the abstract state. To overcome this issue, we propose to learn operators that ‘choose what to predict’ by only modelling changes necessary for abstract planning to achieve specified goals. Experimentally, we show that our approach learns operators that lead to efficient planning across 10 different hybrid robotics domains, including 4 from the challenging BEHAVIOR-100 benchmark, while generalizing to novel initial states, goals, and objects.
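As an illustration of the ‘choose what to predict’ idea, the sketch below contrasts inducing operator effects from every observed abstract-state change with restricting effects to atoms needed for planning. The set-based state representation and the necessary_atoms argument are assumptions made for this sketch, not the paper's code.

```python
# A minimal sketch contrasting "predict everything" with "choose what
# to predict" when inducing operator effects from demonstration data.
# States are modeled as sets of abstract atoms; necessary_atoms is a
# hypothetical oracle for goal- or precondition-relevant atoms.

def all_changes(s, s_next):
    """Baseline: model every observed change in the abstract state."""
    add_effects = s_next - s
    delete_effects = s - s_next
    return add_effects, delete_effects

def necessary_changes(s, s_next, necessary_atoms):
    """Only model changes needed for downstream abstract planning,
    e.g. atoms that support a later precondition or the goal."""
    add_effects = (s_next - s) & necessary_atoms
    delete_effects = (s - s_next) & necessary_atoms
    return add_effects, delete_effects
```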
- Award ID(s):
- 2214177
- PAR ID:
- 10534437
- Publisher / Repository:
- Proceedings of Machine Learning Research: Conference on Robot Learning (CoRL) 2023
- Date Published:
- ISSN:
- 2640-3498
- Format(s):
- Medium: X
- Location:
- Atlanta, GA
- Sponsoring Org:
- National Science Foundation
More Like this
-
Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains. However, they are typically handcrafted and tend to require precise formulations that are not robust to human error. Reinforcement learning (RL) approaches do not require such models, and instead learn domain dynamics by exploring the environment and collecting rewards. However, RL approaches tend to require millions of episodes of experience and often learn policies that are not easily transferable to other tasks. In this paper, we address one aspect of the open problem of integrating these approaches: how can decision-making agents resolve discrepancies in their symbolic planning models while attempting to accomplish goals? We propose an integrated framework named SPOTTER that uses RL to augment and support ("spot") a planning agent by discovering new operators needed by the agent to accomplish goals that are initially unreachable for the agent. SPOTTER outperforms pure-RL approaches while also discovering transferable symbolic knowledge and does not require supervision, successful plan traces or any a priori knowledge about the missing planning operator.
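A minimal sketch of the plan-then-discover loop described above, under the assumption that the planner returns None when the goal is unreachable; planner, rl_discover_operator, and the domain object are hypothetical placeholders rather than the SPOTTER implementation:

```python
# A minimal sketch of augmenting a symbolic planner with RL-discovered
# operators. All interfaces here are assumed for illustration.

def spot_and_plan(domain, task, planner, rl_discover_operator):
    plan = planner(domain, task)
    while plan is None:
        # Goal unreachable with the current operators: use RL to
        # discover a new operator that bridges the gap, then replan.
        new_op = rl_discover_operator(domain, task)
        domain.operators.append(new_op)
        plan = planner(domain, task)
    return plan
```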
-
We present a framework for autonomously learning a portable representation that describes a collection of low-level continuous environments. We show that these abstract representations can be learned in a task-independent, egocentric space specific to the agent and that, when grounded with problem-specific information, they are provably sufficient for planning. We demonstrate transfer in two different domains, where an agent learns a portable, task-independent symbolic vocabulary, as well as operators expressed in that vocabulary, and then learns to instantiate those operators on a per-task basis. This reduces the number of samples required to learn a representation of a new task.
-
Prospection, the act of predicting the consequences of many possible futures, is intrinsic to human planning and action, and may even be at the root of consciousness. Surprisingly, this idea has been explored comparatively little in robotics. In this work, we propose a neural network architecture and associated planning algorithm that (1) learns a representation of the world useful for generating prospective futures after the application of high-level actions from a large pool of expert demonstrations, (2) uses this generative model to simulate the result of sequences of high-level actions in a variety of environments, and (3) uses this same representation to evaluate these actions and perform tree search to find a sequence of high-level actions in a new environment. Models are trained via imitation learning on a variety of domains, including navigation, pick-and-place, and a surgical robotics task. Our approach allows us to visualize intermediate motion goals and learn to plan complex activity from visual information.
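The tree-search step described in (3) can be illustrated with a small best-first search over imagined futures; simulate (the learned generative model) and value (the learned evaluator) are hypothetical stand-ins for the paper's networks:

```python
# A minimal sketch of prospection-style tree search with a learned
# model: expand imagined futures, keep the best-scoring action sequence.

import heapq
from itertools import count

def prospective_search(state, actions, simulate, value, depth=3):
    """Best-first search over futures imagined by the learned model."""
    tie = count()  # tiebreaker so states never need to be comparable
    frontier = [(-value(state), next(tie), state, [])]
    best_plan, best_score = [], value(state)
    while frontier:
        neg_score, _, s, plan = heapq.heappop(frontier)
        if -neg_score > best_score:
            best_plan, best_score = plan, -neg_score
        if len(plan) >= depth:
            continue
        for a in actions:
            s_next = simulate(s, a)  # imagined outcome of taking a in s
            heapq.heappush(frontier,
                           (-value(s_next), next(tie), s_next, plan + [a]))
    return best_plan
```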
-
Integrating Symbolic Planning and Reinforcement Learning for Following Temporal Logic Specifications
Teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments is a challenging problem. We consider a setting in which the user defines every task by a linear temporal logic (LTL) formula. However, some causal dependencies in complex environments may be unknown to the user in advance. Hence, when a human user specifies instructions, the robot cannot solve the tasks by simply following them. In this work, we propose a hierarchical reinforcement learning (HRL) framework in which a symbolic transition model is learned to efficiently produce high-level plans that guide the agent to solve different tasks. Specifically, the symbolic transition model is learned by inductive logic programming (ILP) to capture the logical rules of state transitions. By planning over the product of the symbolic transition model and the automaton derived from the LTL formula, the agent can resolve causal dependencies and break a causally complex problem down into a sequence of simpler low-level sub-tasks. We evaluate the proposed framework on three environments in both discrete and continuous domains, showing advantages over previous representative methods.
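To illustrate planning over the product of the learned symbolic transition model and the LTL-derived automaton, here is a minimal breadth-first sketch; the transition, labeling, and automaton functions are assumed interfaces, not the paper's implementation:

```python
# A minimal sketch of search in the product space of a symbolic
# transition model and an LTL automaton. sym_next yields (operator,
# next symbolic state) pairs; aut_next steps the automaton on the
# label of the new state; accepting is the set of accepting states.

from collections import deque

def product_plan(sym_init, aut_init, sym_next, aut_next, label, accepting):
    """BFS over product states (symbolic state, automaton state);
    a plan is found when the automaton reaches an accepting state."""
    start = (sym_init, aut_init)
    queue, parents = deque([start]), {start: None}
    while queue:
        s, q = queue.popleft()
        if q in accepting:
            # Reconstruct the operator sequence back to the start.
            plan, node = [], (s, q)
            while parents[node] is not None:
                node, op = parents[node]
                plan.append(op)
            return list(reversed(plan))
        for op, s2 in sym_next(s):       # learned symbolic transitions
            q2 = aut_next(q, label(s2))  # automaton steps on the label
            if (s2, q2) not in parents:
                parents[(s2, q2)] = ((s, q), op)
                queue.append((s2, q2))
    return None  # no plan satisfies the LTL specification
```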