Title: Safe Learning of PDDL Domains with Conditional Effects
Powerful domain-independent planners have been developed to solve various types of planning problems. These planners often require a model of the acting agent's actions, given in some planning domain description language. Manually designing such an action model is a notoriously challenging task. An alternative is to automatically learn action models from observation. Such an action model is called safe if every plan created with it is consistent with the real, unknown action model. Algorithms for learning such safe action models exist, yet they cannot handle domains with conditional or universal effects, which are common constructs in many planning problems. We prove that learning non-trivial safe action models with conditional effects may require an exponential number of samples. Then, we identify reasonable assumptions under which such learning is tractable and propose Conditional-SAM, the first algorithm capable of doing so. We analyze Conditional-SAM theoretically and evaluate it experimentally. Our results show that the action models learned by Conditional-SAM can be used to perfectly solve most of the test set problems in most of the evaluated domains.
Award ID(s):
1942336 1939677 1908287
PAR ID:
10544592
Publisher / Repository:
AAAI
Journal Name:
Proceedings of the International Conference on Automated Planning and Scheduling
Volume:
34
ISSN:
2334-0835
Page Range / eLocation ID:
387 to 395
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Powerful domain-independent planners have been developed to solve various types of planning problems. These planners often require a model of the acting agent's actions, given in some planning domain description language. Yet obtaining such an action model is a notoriously hard task. This task is even more challenging in mission-critical domains, where a trial-and-error approach to learning how to act is not an option. In such domains, the action model used to generate plans must be safe, in the sense that plans generated with it must be applicable and achieve their goals. Learning safe action models for planning has been recently explored for domains in which states are sufficiently described with Boolean variables. In this work, we go beyond this limitation and propose the NSAM algorithm. NSAM runs in time that is polynomial in the number of observations and, under certain conditions, is guaranteed to return safe action models. We analyze its worst-case sample complexity, which may be intractable for some domains. Empirically, however, NSAM can quickly learn a safe action model that can solve most problems in the domain. 
  2. Creating a domain model, even for classical, domain-independent planning, is a notoriously hard knowledge-engineering task. A natural approach to solve this problem is to learn a domain model from observations. However, model learning approaches frequently do not provide safety guarantees: the learned model may assume actions are applicable when they are not, and may incorrectly capture actions' effects. This may result in generating plans that will fail when executed. In some domains such failures are not acceptable, due to the cost of failure or inability to replan online after failure. In such settings, all learning must be done offline, based on some observations collected, e.g., by some other agents or a human. Through this learning, the task is to generate a plan that is guaranteed to be successful. This is called the model-free planning problem. Prior work proposed an algorithm for solving the model-free planning problem in classical planning. However, that algorithm was limited to learning grounded domains, and thus could not scale. We generalize this prior work and propose the first safe model-free planning algorithm for lifted domains. We prove the correctness of our approach, and provide a statistical analysis showing that the number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model. We also present experiments on twelve IPC domains showing that our approach is able to learn the real action model in all cases with at most two trajectories.
  3. We consider the problem of learning action models for planning in unknown stochastic environments that can be defined using the Probabilistic Planning Domain Description Language (PPDDL). As input, we are given a set of previously executed trajectories, and the main challenge is to learn an action model that has a similar goal achievement probability to the policies used to create these trajectories. To this end, we introduce a variant of PPDDL in which there is uncertainty about the transition probabilities, specified by an interval for each factor that contains the respective true transition probabilities. Then, we present SAM+, an algorithm that learns such an imprecise-PPDDL environment model. SAM+ has a polynomial time and sample complexity, and guarantees that with high probability, the true environment is indeed captured by the defined intervals. We prove that the action model SAM+ outputs has a goal achievement probability that is almost as good as or better than that of the policies used to produce the training trajectories. Then, we show how to produce a PPDDL model based on this imprecise-PPDDL model that has similar properties.
  4. Robotic manipulation problems are inherently continuous, but typically have underlying discrete structure, e.g., whether or not an object is grasped. This means many problems are multi-modal and in particular have a continuous infinity of modes. For example, in a pick-and-place manipulation domain, every grasp and placement of an object is a mode. Usually manipulation problems require the robot to transition into different modes, e.g., going from a mode with an object placed to another mode with the object grasped. To successfully find a manipulation plan, a planner must find a sequence of valid single-mode motions as well as valid transitions between these modes. Many manipulation planners have been proposed to solve tasks with multi-modal structure. However, these methods require mode-specific planners and fail to scale to very cluttered environments or to tasks that require long sequences of transitions. This paper presents a general layered planning approach to multi-modal planning that uses a discrete “lead” to bias search towards useful mode transitions. The difficulty of achieving specific mode transitions is captured online and used to bias search towards more promising sequences of modes. We demonstrate our planner on complex scenes and show that significant performance improvements are tied to both our discrete “lead” and our continuous representation. 
  5. Most social challenges fall outside of the authority of any single individual and therefore require collective action—coordinated efforts by many stakeholders to implement solutions. Despite growing interest in teaching students to lead collective action, we lack models for how to teach these skills. Collective action ostensibly involves design: the act of planning to change existing situations into preferred ones. In other domains, instructors commonly scaffold design using an instructional model known as studio critique in which students strengthen their plans by exchanging arguments with peers and instructors. This study explores whether studio critique can serve as the basis for an effective instructional model in collective action. Using design-based research methods, we designed and implemented scoping deliberations, a new instructional model that augments studio critique with domain-specific templates for planning collective action and repeats weekly to enable iterations. We used process tracing to analyze data from field notes, video, and artifacts to evaluate causal explanations for events observed in this case study. By implementing scoping deliberations in a 10-week undergraduate course, we found that this model appeared effective at scaffolding engagement in planning collective action: students articulated and refined their plans by engaging in argumentation and iteration, as expected. However, students struggled to contact the community stakeholders with whom they planned to work. As a result, their plans rested on implausible, untested assertions. These findings advance instructional science by showing that collective action may require new instructional models that help students to test their assertions against feedback from community stakeholders. Practically, scoping deliberations appear most useful for scaffolding thoughtful planning in conditions when students are already collaborating with stakeholders. 