Data-Driven Decision-Theoretic Planning using Recurrent Sum-Product-Max Networks

Tatavarti, Hari; Doshi, Prashant; Hayes, Layton

Sum-product networks (SPNs) are knowledge compilation models related to other graphical models for efficient probabilistic inference, such as arithmetic circuits and AND/OR graphs. Recent investigations into generalizing SPNs have yielded sum-product-max networks (SPMNs), which offer a data-driven alternative for decision making, an area that has predominantly relied on handcrafted models. However, SPMNs are not suited for decision-theoretic planning, which involves sequential decision making over multiple time steps. In this paper, we present recurrent SPMNs (RSPMNs) that learn from and model decision-making data over time. RSPMNs utilize a template network that is unfolded as needed depending on the length of the data sequence. This is significant because RSPMNs not only inherit the benefits of SPNs in being data driven and mostly tractable, but are also well suited for planning problems. We establish soundness conditions on the template network, which guarantee that the resulting SPMN is valid, and present a structure learning algorithm to learn a sound template. RSPMNs learned on a testbed of data sets, some generated using RDDLSim, yield MEUs and policies that are close to the optimal on perfectly-observed domains and easily improve on a recent batch-constrained RL method, which is important because RSPMNs offer a new model-based approach to offline RL.
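
To make the idea of computing a maximum expected utility (MEU) and a policy from a sum-product-max network concrete, the following is a minimal sketch of a bottom-up evaluation over a tiny hand-built network. It is an illustration only, not the paper's learning or unfolding algorithm: the node classes, the `evaluate` function, and the umbrella example with its weights and utilities are all hypothetical, and the sketch covers a single time step rather than an unfolded template.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Utility:
    value: float  # utility returned by this leaf


@dataclass
class Leaf:
    var: str    # random variable this indicator leaf tests
    state: str  # state for which the indicator is 1


@dataclass
class Sum:
    children: List[Tuple[float, "Node"]]  # (weight, child) pairs


@dataclass
class Product:
    children: List["Node"]


@dataclass
class Max:
    decision: str               # name of the decision variable
    children: Dict[str, "Node"]  # one child per available action


Node = Union[Utility, Leaf, Sum, Product, Max]


def evaluate(node: Node, evidence: Dict[str, str], policy: Dict[str, str]) -> float:
    """Bottom-up pass (hypothetical helper): returns the value at `node`.

    Max nodes record their best action in `policy`. In this toy version,
    Max nodes inside branches that are not chosen are recorded as well.
    """
    if isinstance(node, Utility):
        return node.value
    if isinstance(node, Leaf):
        observed = evidence.get(node.var)
        # An unobserved variable is marginalized out: the indicator is 1.
        return 1.0 if observed is None or observed == node.state else 0.0
    if isinstance(node, Sum):
        return sum(w * evaluate(c, evidence, policy) for w, c in node.children)
    if isinstance(node, Product):
        result = 1.0
        for child in node.children:
            result *= evaluate(child, evidence, policy)
        return result
    if isinstance(node, Max):
        values = {a: evaluate(c, evidence, policy) for a, c in node.children.items()}
        best_action = max(values, key=values.get)
        policy[node.decision] = best_action
        return values[best_action]
    raise TypeError(f"unknown node type: {type(node).__name__}")


def outcome(p_rain: float, u_rain: float, u_sun: float) -> Sum:
    """Sum over the weather variable; each branch pairs an indicator with a utility."""
    return Sum([
        (p_rain, Product([Leaf("weather", "rain"), Utility(u_rain)])),
        (1.0 - p_rain, Product([Leaf("weather", "sun"), Utility(u_sun)])),
    ])


# Hypothetical one-step problem: decide whether to take an umbrella.
network = Max("umbrella", {
    "take": outcome(0.3, 70.0, 80.0),
    "leave": outcome(0.3, 0.0, 100.0),
})

policy: Dict[str, str] = {}
meu = evaluate(network, evidence={}, policy=policy)
print(meu, policy)
```

With no evidence, the "take" branch works out to 0.3·70 + 0.7·80 = 77 and the "leave" branch to 0.3·0 + 0.7·100 = 70, so the max node selects "take". A recurrent variant would repeat a template of such nodes once per time step; how the copies are connected is specified in the paper and is not reproduced here.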

Award ID(s): 1815598

PAR ID: 10283611

Date Published: 2021-05-17

Journal Name: Proceedings of the International Conference on Automated Planning and Scheduling

Volume: 31

Issue: 1

ISSN: 2334-0835

Page Range / eLocation ID: 606-614

Sponsoring Org: National Science Foundation

The DOI is not currently available.