Creating a domain model, even for classical, domainindependent planning, is a notoriously hard knowledgeengineering task. A natural approach to solve this problem is to learn a domain model from observations. However, model learning approaches frequently do not provide safety guarantees: the learned model may assume actions are applicable when they are not, and may incorrectly capture actions' effects. This may result in generating plans that will fail when executed. In some domains such failures are not acceptable, due to the cost of failure or inability to replan online after failure. In such settings, all learning must be done offline, based on some observations collected, e.g., by some other agents or a human. Through this learning, the task is to generate a plan that is guaranteed to be successful. This is called the modelfree planning problem. Prior work proposed an algorithm for solving the modelfree planning problem in classical planning. However, they were limited to learning grounded domains, and thus they could not scale. We generalize this prior work and propose the first safe modelfree planning algorithm for lifted domains. We prove the correctness of our approach, and provide a statistical analysis showing that the number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model. We also present experiments on twelve IPC domains showing that our approach is able to learn the real action model in all cases with at most two trajectories.
more » « less NSFPAR ID:
 10310362
 Date Published:
 Journal Name:
 Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Powerful domainindependent planners have been developed to solve various types of planning problems. These planners often require a model of the acting agent's actions, given in some planning domain description language. Yet obtaining such an action model is a notoriously hard task. This task is even more challenging in missioncritical domains, where a trialanderror approach to learning how to act is not an option. In such domains, the action model used to generate plans must be safe, in the sense that plans generated with it must be applicable and achieve their goals. Learning safe action models for planning has been recently explored for domains in which states are sufficiently described with Boolean variables. In this work, we go beyond this limitation and propose the NSAM algorithm. NSAM runs in time that is polynomial in the number of observations and, under certain conditions, is guaranteed to return safe action models. We analyze its worstcase sample complexity, which may be intractable for some domains. Empirically, however, NSAM can quickly learn a safe action model that can solve most problems in the domain.more » « less

Integrating Symbolic Planning and Reinforcement Learning for Following Temporal Logic SpecificationsTeaching a deep reinforcement learning (RL) agent to follow instructions in multitask environments is a challenging problem. We consider that user defines every task by a linear temporal logic (LTL) formula. However, some causal dependencies in complex environments may be unknown to the user in advance. Hence, when human user is specifying instructions, the robot cannot solve the tasks by simply following the given instructions. In this work, we propose a hierarchical reinforcement learning (HRL) framework in which a symbolic transition model is learned to efficiently produce highlevel plans that can guide the agent efficiently solve different tasks. Specifically, the symbolic transition model is learned by inductive logic programming (ILP) to capture logic rules of state transitions. By planning over the product of the symbolic transition model and the automaton derived from the LTL formula, the agent can resolve causal dependencies and break a causally complex problem down into a sequence of simpler lowlevel subtasks. We evaluate the proposed framework on three environments in both discrete and continuous domains, showing advantages over previous representative methods.more » « less

INTRODUCTION Solving quantum manybody problems, such as finding ground states of quantum systems, has farreaching consequences for physics, materials science, and chemistry. Classical computers have facilitated many profound advances in science and technology, but they often struggle to solve such problems. Scalable, faulttolerant quantum computers will be able to solve a broad array of quantum problems but are unlikely to be available for years to come. Meanwhile, how can we best exploit our powerful classical computers to advance our understanding of complex quantum systems? Recently, classical machine learning (ML) techniques have been adapted to investigate problems in quantum manybody physics. So far, these approaches are mostly heuristic, reflecting the general paucity of rigorous theory in ML. Although they have been shown to be effective in some intermediatesize experiments, these methods are generally not backed by convincing theoretical arguments to ensure good performance. RATIONALE A central question is whether classical ML algorithms can provably outperform nonML algorithms in challenging quantum manybody problems. We provide a concrete answer by devising and analyzing classical ML algorithms for predicting the properties of ground states of quantum systems. We prove that these ML algorithms can efficiently and accurately predict groundstate properties of gapped local Hamiltonians, after learning from data obtained by measuring other ground states in the same quantum phase of matter. Furthermore, under a widely accepted complexitytheoretic conjecture, we prove that no efficient classical algorithm that does not learn from data can achieve the same prediction guarantee. By generalizing from experimental data, ML algorithms can solve quantum manybody problems that could not be solved efficiently without access to experimental data. RESULTS We consider a family of gapped local quantum Hamiltonians, where the Hamiltonian H ( x ) depends smoothly on m parameters (denoted by x ). The ML algorithm learns from a set of training data consisting of sampled values of x , each accompanied by a classical representation of the ground state of H ( x ). These training data could be obtained from either classical simulations or quantum experiments. During the prediction phase, the ML algorithm predicts a classical representation of ground states for Hamiltonians different from those in the training data; groundstate properties can then be estimated using the predicted classical representation. Specifically, our classical ML algorithm predicts expectation values of products of local observables in the ground state, with a small error when averaged over the value of x . The run time of the algorithm and the amount of training data required both scale polynomially in m and linearly in the size of the quantum system. Our proof of this result builds on recent developments in quantum information theory, computational learning theory, and condensed matter theory. Furthermore, under the widely accepted conjecture that nondeterministic polynomialtime (NP)â€“complete problems cannot be solved in randomized polynomial time, we prove that no polynomialtime classical algorithm that does not learn from data can match the prediction performance achieved by the ML algorithm. In a related contribution using similar proof techniques, we show that classical ML algorithms can efficiently learn how to classify quantum phases of matter. In this scenario, the training data consist of classical representations of quantum states, where each state carries a label indicating whether it belongs to phase A or phase B . The ML algorithm then predicts the phase label for quantum states that were not encountered during training. The classical ML algorithm not only classifies phases accurately, but also constructs an explicit classifying function. Numerical experiments verify that our proposed ML algorithms work well in a variety of scenarios, including Rydberg atom systems, twodimensional random Heisenberg models, symmetryprotected topological phases, and topologically ordered phases. CONCLUSION We have rigorously established that classical ML algorithms, informed by data collected in physical experiments, can effectively address some quantum manybody problems. These rigorous results boost our hopes that classical ML trained on experimental data can solve practical problems in chemistry and materials science that would be too hard to solve using classical processing alone. Our arguments build on the concept of a succinct classical representation of quantum states derived from randomized Pauli measurements. Although some quantum devices lack the local control needed to perform such measurements, we expect that other classical representations could be exploited by classical ML with similarly powerful results. How can we make use of accessible measurement data to predict properties reliably? Answering such questions will expand the reach of nearterm quantum platforms. Classical algorithms for quantum manybody problems. Classical ML algorithms learn from training data, obtained from either classical simulations or quantum experiments. Then, the ML algorithm produces a classical representation for the ground state of a physical system that was not encountered during training. Classical algorithms that do not learn from data may require substantially longer computation time to achieve the same task.more » « less

Prospection, the act of predicting the consequences of many possible futures, is intrinsic to human planning and action, and may even be at the root of consciousness. Surprisingly, this idea has been explored comparatively little in robotics. In this work, we propose a neural network architecture and associated planning algorithm that (1) learns a representation of the world useful for generating prospective futures after the application of highlevel actions from a large pool of expert demonstrations, (2) uses this generative model to simulate the result of sequences of highlevel actions in a variety of environments, and (3) uses this same representation to evaluate these actions and perform tree search to find a sequence of highlevel actions in a new environment. Models are trained via imitation learning on a variety of domains, including navigation, pickandplace, and a surgical robotics task. Our approach allows us to visualize intermediate motion goals and learn to plan complex activity from visual information.more » « less

Classifying whether collected information related to emerging topics and domains is fake/incorrect is not an easy task because we do not have enough labeled data in the domains. Given labeled data from source domains (e.g., gossip and health) and limited labeled data from a newly emerging target domain (e.g., COVID19 and Ukraine war), simply applying knowledge learned from source domains to the target domain may not work well because of different data distribution. To solve the problem, in this paper, we propose an energybased domain adaptation with active learning for early misinformation detection. Given three real world news datasets, we evaluate our proposed model against two baselines in both domain adaptation and the whole pipeline. Our model outperforms the baselines, improving at least 5% in the domain adaptation task and 10% in the whole pipeline, showing effectiveness of our proposed approach.more » « less