NSF PAR Search | NSF Public Access Repository

Learning Probably Approximately Complete and Safe Action Models for Stochastic Worlds

https://doi.org/10.1609/aaai.v36i9.21215

Juba, Brendan; Stern, Roni (June 2022, Proceedings of the AAAI Conference on Artificial Intelligence)

We consider the problem of learning action models for planning in unknown stochastic environments that can be defined using the Probabilistic Planning Domain Description Language (PPDDL). As input, we are given a set of previously executed trajectories, and the main challenge is to learn an action model that has a similar goal achievement probability to the policies used to create these trajectories. To this end, we introduce a variant of PPDDL in which there is uncertainty about the transition probabilities, specified by an interval for each factor that contains the respective true transition probabilities. Then, we present SAM+, an algorithm that learns such an imprecise-PPDDL environment model. SAM+ has polynomial time and sample complexity, and guarantees that with high probability, the true environment is indeed captured by the defined intervals. We prove that the action model SAM+ outputs has a goal achievement probability that is almost as good as or better than that of the policies used to produce the training trajectories. Then, we show how to produce a PPDDL model based on this imprecise-PPDDL model that has similar properties.
Full Text Available
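The abstract above describes transition probabilities bounded by intervals that contain the true values with high probability. The sketch below illustrates one standard way such an interval could be built from trajectory counts, using Hoeffding's inequality. This is an assumption made for illustration: the abstract does not say which estimator SAM+ uses, and the function name `hoeffding_interval` and its parameters are hypothetical.

```python
import math


def hoeffding_interval(successes, trials, delta):
    """Return (low, high) bounds on an unknown outcome probability.

    Assumption for illustration: the outcome was observed `successes`
    times in `trials` independent applications of an action. By
    Hoeffding's inequality the true probability lies in the returned
    interval with probability at least 1 - delta. Covering many
    probabilities at once would need a smaller per-interval delta
    (for example via a union bound).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    estimate = successes / trials
    # Half-width of the two-sided Hoeffding confidence interval.
    half_width = math.sqrt(math.log(2 / delta) / (2 * trials))
    return max(0.0, estimate - half_width), min(1.0, estimate + half_width)


low, high = hoeffding_interval(successes=50, trials=100, delta=0.05)
print(f"interval: [{low:.3f}, {high:.3f}]")
```

The interval shrinks at a rate of one over the square root of the number of observations, which is one reason a sample-complexity bound can be stated in terms of the number of trajectories.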
Polynomial Time Reinforcement Learning in Factored State MDPs with Linear Value Functions

Deng, Zihao; Devic, Siddartha; Juba, Brendan (January 2022, Proceedings of Machine Learning Research)

Full Text Available
Conditional Linear Regression for Heterogeneous Covariances

Liang, Leda; Juba, Brendan (January 2022, Proceedings of Machine Learning Research)

Full Text Available
Hardness of Maximum Likelihood Learning of DPPs

Grigorescu, Elena; Juba, Brendan; Wimmer, Karl; Xie, Ning (January 2022, Proceedings of Machine Learning Research)

Full Text Available
Safe Learning of Lifted Action Models

https://doi.org/10.24963/kr.2021/36

Juba, Brendan; Le, Hai S.; Stern, Roni (September 2021, Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning)

Creating a domain model, even for classical, domain-independent planning, is a notoriously hard knowledge-engineering task. A natural approach to solving this problem is to learn a domain model from observations. However, model learning approaches frequently do not provide safety guarantees: the learned model may assume actions are applicable when they are not, and may incorrectly capture actions' effects. This may result in generating plans that will fail when executed. In some domains such failures are not acceptable, due to the cost of failure or inability to replan online after failure. In such settings, all learning must be done offline, based on some observations collected, e.g., by some other agents or a human. Through this learning, the task is to generate a plan that is guaranteed to be successful. This is called the model-free planning problem. Prior work proposed an algorithm for solving the model-free planning problem in classical planning. However, that work was limited to learning grounded domains, and thus could not scale. We generalize this prior work and propose the first safe model-free planning algorithm for lifted domains. We prove the correctness of our approach, and provide a statistical analysis showing that the number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model. We also present experiments on twelve IPC domains showing that our approach is able to learn the real action model in all cases with at most two trajectories.
Full Text Available
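The safety concern in the abstract above, a learned model assuming an action is applicable when it is not, can be illustrated with a conservative learning rule. The sketch below is a simplified, grounded illustration under stated assumptions: states are sets of true propositions, preconditions are positive literals only, and actions are deterministic. It is not the lifted algorithm from the paper, and the name `learn_conservative_model` is hypothetical.

```python
def learn_conservative_model(transitions):
    """Learn a cautious action model from (pre_state, action, post_state) triples.

    Assumptions for illustration: each state is a set of propositions
    that are true, and actions are deterministic.
    - Preconditions: every proposition that held in ALL observed
      pre-states of the action. Any precondition that is a positive
      literal must be among these, so the learned action is never
      considered applicable in a state that lacks one of them.
    - Add effects: propositions seen to become true.
    - Delete effects: propositions seen to become false.
    """
    model = {}
    for pre, action, post in transitions:
        pre, post = frozenset(pre), frozenset(post)
        entry = model.get(action)
        if entry is None:
            model[action] = {
                "pre": set(pre),
                "add": set(post - pre),
                "del": set(pre - post),
            }
        else:
            entry["pre"] &= pre          # keep only what always held
            entry["add"] |= post - pre   # accumulate observed additions
            entry["del"] |= pre - post   # accumulate observed deletions
    return model


observed = [
    ({"a", "b"}, "act", {"b", "c"}),
    ({"a", "d"}, "act", {"c", "d"}),
]
print(learn_conservative_model(observed))
```

Each additional trajectory can only remove candidate preconditions, so the model starts out overly strict and relaxes as more pre-states are observed.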
Probabilistic Generating Circuits

Zhang, Honghua; Juba, Brendan; Van Den Broeck, Guy (January 2021, Proceedings of the 38th International Conference on Machine Learning)
Full Text Available
One network fits all? Modular versus monolithic task formulations in neural networks

Agarwala, Atish; Das, Abhimanyu; Juba, Brendan; Panigrahy, Rina; Sharan, Vatsal; Wang, Xin; Zhang, Qiuyi (January 2021, Proceedings of the Ninth International Conference on Learning Representations (ICLR 2021))
Full Text Available
List Learning with Attribute Noise

Cheraghchi, Mahdi; Grigorescu, Elena; Juba, Brendan; Wimmer, Karl; Xie, Ning (January 2021, Proceedings of The 24th International Conference on Artificial Intelligence and Statistics)
Full Text Available
A Tensor Decomposition Method for Unsupervised Feature Learning on Satellite Imagery

https://doi.org/10.1109/IGARSS39084.2020.9324715

Dehghanpoor, Golnoosh; Frachetti, Michael; Juba, Brendan (September 2020, IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium)

Full Text Available
Conditional Linear Regression

Calderon, Diego; Juba, Brendan; Li, Sirui; Li, Zongyi; Ruan, Lisa (August 2020, Proceedings of Machine Learning Research)

Work in machine learning and statistics commonly focuses on building models that capture the vast majority of data, possibly ignoring a segment of the population as outliers. However, there may not exist a good, simple model for the distribution, so we seek to find a small subset where there exists such a model. We give a computationally efficient algorithm with theoretical analysis for the conditional linear regression task, which is the joint task of identifying a significant portion of the data distribution, described by a k-DNF, along with a linear predictor on that portion with a small loss. In contrast to work in robust statistics on small subsets, our loss bounds do not feature a dependence on the density of the portion we fit, and compared to previous work on conditional linear regression, our algorithm’s running time scales polynomially with the sparsity of the linear predictor. We also demonstrate empirically that our algorithm can leverage this advantage to obtain a k-DNF with a better linear predictor in practice.
Full Text Available
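The conditional linear regression task in the abstract above pairs a Boolean condition on the data with a linear predictor fitted only on the points that satisfy it. The toy sketch below shows the shape of that joint task by brute force. The assumptions are loud ones: conditions are restricted to a single Boolean literal rather than a full k-DNF, the predictor has one real-valued input, and every candidate is enumerated. This is not the efficient algorithm the paper describes, and the names `fit_line` and `best_single_literal_condition` are hypothetical.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept, mse)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        slope = 0.0  # all x values equal: fall back to a constant predictor
    else:
        slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    intercept = mean_y - slope * mean_x
    mse = sum((y - (slope * x + intercept)) ** 2 for x, y in points) / n
    return slope, intercept, mse


def best_single_literal_condition(rows, min_coverage):
    """Pick the literal whose matching rows are best fit by a line.

    rows: non-empty list of (flags, x, y), where flags is a tuple of bools.
    min_coverage: smallest fraction of rows the condition must select.
    Returns a dict describing the best condition, or None if no
    candidate meets the coverage requirement.
    """
    best = None
    for i in range(len(rows[0][0])):
        for value in (True, False):
            subset = [(x, y) for flags, x, y in rows if flags[i] == value]
            if len(subset) < 2 or len(subset) / len(rows) < min_coverage:
                continue
            slope, intercept, mse = fit_line(subset)
            if best is None or mse < best["mse"]:
                best = {"flag": i, "value": value, "slope": slope,
                        "intercept": intercept, "mse": mse}
    return best


rows = [
    ((True,), 0, 1), ((True,), 1, 3), ((True,), 2, 5),    # y = 2x + 1 exactly
    ((False,), 0, 10), ((False,), 1, -4), ((False,), 2, 7),  # no linear pattern
]
print(best_single_literal_condition(rows, min_coverage=0.5))
```

The coverage threshold plays the role of the "significant portion of the data distribution" requirement: without it, a condition selecting only a handful of points could trivially achieve a small loss.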