An influence diagram is a graphical model of a Bayesian decision problem that is solved by finding a strategy that maximizes expected utility. When an influence diagram is solved by variable elimination or a related dynamic programming algorithm, it is traditional to represent a strategy as a sequence of policies, one for each decision variable, where a policy maps the relevant history for a decision to an action. We propose an alternative representation of a strategy as a graph, called a strategy graph, and show how to modify a variable elimination algorithm so that it constructs a strategy graph. We consider both a classic variable elimination algorithm for influence diagrams and a recent extension of this algorithm that has more relaxed constraints on elimination order that allow improved performance. We consider the advantages of representing a strategy as a graph and, in particular, how to simplify a strategy graph so that it is easier to interpret and analyze.
more »
« less
Heuristic AND/OR Search for Solving Influence Diagram
An influence diagram is a graphical representation of sequential decision-making under uncertainty, defining a structured decision problem by conditional probability functions and additive utility functions over discrete state and action variables. The task of finding the maximum expected utility of influence diagrams is closely related to the cost-optimal probabilistic planning, stochastic programmings, or model-based reinforcement learning. In this position paper, we address the heuristic search for solving influence diagram, where we generate admissible heuristic functions from graph decomposition schemes. Then, we demonstrate how such heuristics can guide an AND/OR branch and bound search. Finally, we briefly discuss the future directions for improving the quality of heuristic functions and search strategies.
more »
« less
- Award ID(s):
- 2008516
- PAR ID:
- 10564840
- Publisher / Repository:
- {AAAI} Press
- Date Published:
- Journal Name:
- Proceedings of the International Symposium on Combinatorial Search
- Volume:
- 11
- Issue:
- 1
- ISSN:
- 2832-9171
- Page Range / eLocation ID:
- 127 to 128
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Influence diagrams provide a modeling and inference framework for sequential decision problems, representing the probabilistic knowledge by a Bayesian network and the preferences of an agent by utility functions over the random variables and decision variables. Computing the maximum expected utility (MEU) and the optimizing policy is exponential in the constrained induced width and therefore is notoriously difficult for larger models. In this paper, we develop a new bounding scheme for MEU that applies partitioning based approximations on top of the decomposition scheme called a multi-operator cluster DAG for influence diagrams that is more sensitive to the underlying structure of the model than the classical join-tree decomposition of influence diagrams. Our bounding scheme utilizes a cost-shifting mechanism to tighten the bound further. We demonstrate the effectiveness of the proposed scheme on various hard benchmarks.more » « less
-
null (Ed.)There are notable examples of online search improving over hand-coded or learned policies (e.g. AlphaZero) for sequential decision making. It is not clear, however, whether or not policy improvement is guaranteed for many of these approaches, even when given a perfect leaf evaluation function and transition model. Indeed, simple counterexamples show that seemingly reasonable online search procedures can hurt performance compared to the original policy. To address this issue, we introduce the choice function framework for analyzing online search procedures for policy improvement. A choice function specifies the actions to be considered at every node of a search tree, with all other actions being pruned. Our main contribution is to give sufficient conditions for stationary and non-stationary choice functions to guarantee that the value achieved by online search is no worse than the original policy. In addition, we describe a general parametric class of choice functions that satisfy those conditions and present an illustrative use case of the empirical utility of the framework.more » « less
-
Influence diagrams (IDs) are graphical models for representing and reasoning with sequential decision-making problems under uncertainty. Limited memory influence diagrams (LIMIDs) model a decision-maker (DM) who forgets the history in the course of making a sequence of decisions. The standard inference task in IDs and LIMIDs is to compute the maximum expected utility (MEU), which is one of the most challenging tasks in graphical models. We present a model decomposition framework in both IDs and LIMIDs, which we call submodel decomposition that generates a tree of single-stage decision problems through a tree clustering scheme. We also develop a valuation algebra over the submodels that leads to a hierarchical message passing algorithm that propagates conditional expected utility functions over a submodel-tree as external messages. We show that the overall complexity is bounded by the maximum tree-width over the submodels, common in graphical model algorithms. Finally, we present a new method for computing upper bounds over a submodel-tree by first exponentiating the utility functions yielding a standard probabilistic graphical model as an upper bound and then applying standard variational upper bounds for the marginal MAP inference, yielding tighter upper bounds compared with state-of-the-art bounding schemes for the MEU task.more » « less
-
null (Ed.)We introduce an iterative framework for solving graph coloring problems using decision diagrams. The decision diagram compactly represents all possible color classes, some of which may contain edge conflicts. In each iteration, we use a constrained minimum network flow model to compute a lower bound and identify conflicts. Infeasible color classes associated with these conflicts are removed by refining the decision diagram. We prove that in the best case, our approach may use exponentially smaller diagrams than exact diagrams for proving optimality. We also develop a primal heuristic based on the decision diagram to find a feasible solution at each iteration. We provide an experimental evaluation on all 137 DIMACS graph coloring instances. Our procedure can solve 52 instances optimally, of which 44 are solved within 1 minute. We also compare our method to a state-of-the-art graph coloring solver based on branch-and-price, and show that we obtain competitive results. Lastly, we report an improved lower bound for the open instance C2000.9.more » « less
An official website of the United States government

