Probabilistic circuits (PCs) such as sum-product networks efficiently represent large multivariate probability distributions. They are preferred in practice over other probabilistic representations, such as Bayesian and Markov networks, because PCs can solve marginal inference (MAR) tasks in time that scales linearly in the size of the network. Unfortunately, the most probable explanation (MPE) task and its generalization, the marginal maximum-a-posteriori (MMAP) inference task, remain NP-hard in these models. Inspired by recent work on using neural networks to generate near-optimal solutions to optimization problems such as integer linear programming, we propose an approach that uses neural networks to approximate MMAP inference in PCs. The key idea in our approach is to approximate the cost of an assignment to the query variables using a continuous multilinear function and then use the latter as a loss function. The two main benefits of our new method are that it is self-supervised and that, after the neural network is learned, it requires only linear time to output a solution. We evaluate our new approach on several benchmark datasets and show that it outperforms three competing linear-time approximations: max-product inference, max-marginal inference, and sequential estimation, which are used in practice to solve MMAP tasks in PCs.
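The multilinear relaxation at the heart of the loss can be illustrated with a toy example. The PC structure, parameters, and function names below are hypothetical; this is a minimal sketch of how evaluating a PC at soft query values in [0, 1] yields a differentiable surrogate for the cost of a hard assignment, not the paper's implementation.

```python
import math

# Hypothetical PC over two binary query variables X0, X1:
# a sum node 0.6 * P1(X0) P1(X1) + 0.4 * P2(X0) P2(X1) with Bernoulli leaves.
# The PC's output is multilinear in the leaf indicator values, so replacing
# hard 0/1 assignments with soft values q in [0, 1] interpolates between
# the corners of the hypercube and makes the loss differentiable in q.

def leaf(p_true, q):
    # Bernoulli leaf evaluated at a relaxed (soft) assignment q = P(X = 1).
    return p_true * q + (1.0 - p_true) * (1.0 - q)

def pc(q0, q1):
    prod1 = leaf(0.9, q0) * leaf(0.2, q1)
    prod2 = leaf(0.1, q0) * leaf(0.7, q1)
    return 0.6 * prod1 + 0.4 * prod2

def loss(q0, q1):
    # Continuous surrogate cost a neural network could be trained to minimize.
    return -math.log(pc(q0, q1))

print(loss(1.0, 0.0))  # cost of the hard assignment X0=1, X1=0
print(loss(0.8, 0.3))  # cost of a relaxed assignment
```

Because `pc` is multilinear, its value at any soft point is a convex combination of its values at hard assignments, which is what makes gradients of the relaxed loss informative about the discrete MMAP objective.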


Recently there has been growing interest in learning, from data, probabilistic models that admit poly-time inference, called tractable probabilistic models. Although they generalize more poorly than intractable models, they often yield more accurate estimates at prediction time. In this paper, we seek to further explore this trade-off between generalization performance and inference accuracy by proposing a novel, partially tractable representation called cutset Bayesian networks (CBNs). The main idea in CBNs is to partition the variables into two subsets X and Y, and learn an (intractable) Bayesian network that represents P(X) and a tractable conditional model that represents P(Y|X). The hope is that the intractable model will help improve generalization, while the tractable model, by leveraging Rao-Blackwellised sampling, which combines exact inference and sampling, will help improve the prediction accuracy. To compactly model P(Y|X), we introduce a novel tractable representation called conditional cutset networks (CCNs) in which all conditional probability distributions are represented using calibrated classifiers, i.e., classifiers which typically yield higher-quality probability estimates than conventional classifiers. We show via a rigorous experimental evaluation that CBNs and CCNs yield more accurate posterior estimates than their tractable as well as intractable counterparts.
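The Rao-Blackwellisation scheme described above can be sketched on a toy model. The distributions and names below are hypothetical stand-ins (a Bernoulli for the intractable network over X, a lookup table for the tractable conditional over Y), not the CBN implementation; the point is only the structure of the estimator: sample the intractable part, answer the tractable part exactly, and average.

```python
import random

random.seed(0)

p_x1 = 0.3                       # stand-in for the learned (intractable) model P(X)
p_y1_given_x = {0: 0.8, 1: 0.1}  # stand-in for the tractable conditional P(Y=1 | X)

def rao_blackwell_estimate(n_samples):
    # Estimate P(Y = 1) = E_{x ~ P(X)}[ P(Y = 1 | X = x) ].
    # Each sample contributes an exact conditional probability rather than
    # a 0/1 outcome for Y, which reduces the variance of the estimator.
    total = 0.0
    for _ in range(n_samples):
        x = 1 if random.random() < p_x1 else 0  # sample the intractable part
        total += p_y1_given_x[x]                # exact inference on the tractable part
    return total / n_samples

# Exact answer for this toy model: P(Y=1) = 0.7 * 0.8 + 0.3 * 0.1 = 0.59
est = rao_blackwell_estimate(100_000)
print(est)
```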

We address the problem of scaling up local-search or sampling-based inference in Markov logic networks (MLNs) that have large shared substructures but no (or few) tied weights. Such untied MLNs are ubiquitous in practical applications. However, they have very few symmetries, and as a result lifted inference algorithms, the dominant approach for scaling up inference, perform poorly on them. The key idea in our approach is to reduce the hard, time-consuming subtask in sampling algorithms, namely computing the sum of weights of features that satisfy a full assignment, to the problem of computing a set of partition functions of graphical models, each defined over the logical variables in a first-order formula. The importance of this reduction is that when the treewidth of all the graphical models is small, it yields an order-of-magnitude speedup. When the treewidth is large, we propose an over-symmetric approximation and experimentally demonstrate that it is both fast and accurate.
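The core subroutine the reduction relies on, computing the partition function of a low-treewidth graphical model, can be sketched as follows. The pairwise chain model and function names are hypothetical illustrations (a chain has treewidth 1), not the paper's MLN code; a brute-force enumerator is included only to check the elimination result.

```python
import itertools

# Pairwise potentials phi_i(x_i, x_{i+1}) over binary variables x_0, x_1, x_2.
potentials = [
    {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.5, (1, 1): 3.0},
    {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.5, (1, 1): 0.5},
]

def partition_function_chain(pots):
    # Variable elimination left to right: message[v] holds the total weight
    # of all prefixes whose current variable takes value v. Linear time in
    # the chain length, versus exponential brute-force enumeration.
    message = {0: 1.0, 1: 1.0}
    for phi in pots:
        message = {
            v2: sum(message[v1] * phi[(v1, v2)] for v1 in (0, 1))
            for v2 in (0, 1)
        }
    return sum(message.values())

def partition_function_bruteforce(pots):
    # Exponential check: sum the weight of every full assignment.
    n = len(pots) + 1
    z = 0.0
    for assign in itertools.product((0, 1), repeat=n):
        w = 1.0
        for i, phi in enumerate(pots):
            w *= phi[(assign[i], assign[i + 1])]
        z += w
    return z

print(partition_function_chain(potentials))  # both methods agree: Z = 14.5
```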

We consider the problem of computing r-th order statistics, namely finding an assignment having rank r in a probabilistic graphical model. We show that the problem is NP-hard even when the graphical model has no edges (zero-treewidth models), via a reduction from the partition problem. We build on this reduction, specifically a pseudo-polynomial time algorithm for number partitioning, to derive a pseudo-polynomial time approximation algorithm for solving the r-th order statistics problem in zero-treewidth models. We then extend this algorithm to arbitrary graphical models by generalizing it to tree decompositions, and demonstrate via experimental evaluation on various datasets that our proposed algorithm is more accurate than sampling algorithms.
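The pseudo-polynomial number-partitioning routine that the reduction builds on can be sketched as the standard textbook dynamic program below (the helper names are ours, not the paper's). Its O(n * S) table over achievable subset sums is polynomial in the numeric magnitude S of the inputs rather than in their bit length, which is what "pseudo-polynomial" means here.

```python
def achievable_subset_sums(numbers):
    # reachable[s] is True iff some subset of `numbers` sums to s.
    total = sum(numbers)
    reachable = [False] * (total + 1)
    reachable[0] = True
    for x in numbers:
        # Sweep downward so each number is used at most once.
        for s in range(total, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable

def best_partition_gap(numbers):
    # Smallest |sum(A) - sum(B)| over all two-way partitions of the input.
    total = sum(numbers)
    reachable = achievable_subset_sums(numbers)
    return min(abs(total - 2 * s) for s in range(total + 1) if reachable[s])

print(best_partition_gap([3, 1, 4, 2, 2]))  # gap 0: e.g. {4, 2} vs {3, 1, 2}
```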