We consider the problem of computing rth order statistics, namely finding an assignment having rank r in a probabilistic graphical model. We show that the problem is NPhard even when the graphical model has no edges (zerotreewidth models) via a reduction from the partition problem. We use this reduction, specifically a pseudopolynomial time algorithm for number partitioning to yield a pseudopolynomial time approximation algorithm for solving the rth order statistics problem in zero treewidth models. We then extend this algorithm to arbitrary graphical models by generalizing it to tree decompositions, and demonstrate via experimental evaluation on various datasets that our proposed algorithm is more accurate than sampling algorithms.
 Award ID(s):
 1652835
 NSFPAR ID:
 10051106
 Date Published:
 Journal Name:
 Proceedings of the TwentySixth International Joint Conference on Artificial Intelligence
 Page Range / eLocation ID:
 4617 to 4624
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Explaining the results of Machine learning algorithms is crucial given the rapid growth and potential applicability of these methods in critical domains including healthcare, defense, autonomous driving, etc. In this paper, we address this problem in the context of Markov Logic Networks (MLNs) which are highly expressive statistical relational models that combine firstorder logic with probabilistic graphical models. MLNs in general are known to be interpretable models, i.e., MLNs can be understood more easily by humans as compared to models learned by approaches such as deep learning. However, at the same time, it is not straightforward to obtain humanunderstandable explanations specific to an observed inference result (e.g. marginal probability estimate). This is because, the MLN provides a lifted interpretation, one that generalizes to all possible worlds/instantiations, which are not query/evidence specific. In this paper, we extract groundedexplanations, i.e., explanations defined w.r.t specific inference queries and observed evidence. We extract these explanations from importance weights defined over the MLN formulas that encode the contribution of formulas towards the final inference results. We validate our approach in real world problems related to analyzing reviews from Yelp, and show through userstudies that our explanations are richer than stateoftheart nonrelational explainers such as LIME .more » « less

Statistical relational learning models are powerful tools that combine ideas from firstorder logic with probabilistic graphical models to represent complex dependencies. Despite their success in encoding large problems with a compact set of weighted rules, performing inference over these models is often challenging. In this paper, we show how to effectively combine two powerful ideas for scaling inference for large graphical models. The first idea, lifted inference, is a wellstudied approach to speeding up inference in graphical models by exploiting symmetries in the underlying problem. The second idea is to frame Maximum a posteriori (MAP) inference as a convex optimization problem and use alternating direction method of multipliers (ADMM) to solve the problem in parallel. A wellstudied relaxation to the combinatorial optimization problem defined for logical Markov random fields gives rise to a hingeloss Markov random field (HLMRF) for which MAP inference is a convex optimization problem. We show how the formalism introduced for coloring weighted bipartite graphs using a color refinement algorithm can be integrated with the ADMM optimization technique to take advantage of the sparse dependency structures of HLMRFs. Our proposed approach, lifted hingeloss Markov random fields (LHLMRFs), preserves the structure of the original problem after lifting and solves lifted inference as distributed convex optimization with ADMM. In our empirical evaluation on realworld problems, we observe up to a three times speed up in inference over HLMRFs.more » « less

Graphical models have witnessed significant growth and usage in spatial data science for modeling data referenced over a massive number of spatialtemporal coordinates. Much of this literature has focused on a single or relatively few spatially dependent outcomes. Recent attention has focused upon addressing modeling and inference for substantially large number of outcomes. While spatial factor models and multivariate basis expansions occupy a prominent place in this domain, this article elucidates a recent approach, graphical Gaussian Processes, that exploits the notion of conditional independence among a very large number of spatial processes to build scalable graphical models for fully modelbased Bayesian analysis of multivariate spatial data.more » « less

We study the problem of designing scalable algorithms to find effective intervention strategies for controlling stochastic epidemic processes on networks. This is a common problem arising in agent based models for epidemic spread.Previous approaches to this problem focus on either heuristics with no guarantees or approximation algorithms that scale only to networks corresponding to countysized populations, typically, with less than a million nodes. In particular, the mathematicalprogramming based approaches need to solve the Linear Program (LP) relaxation of the problem using an LP solver, which restricts the scalability of this approach. In this work, we overcome this restriction by designing an algorithm that adapts the multiplicative weights update (MWU) framework, along with the sample average approximation (SAA) technique, to approximately solve the linear program (LP) relaxation for the problem. To scale this approach further, we provide a memoryefficient algorithm that enables scaling to large networks, corresponding to countrysize populations, with over 300 million nodes and 30 billion edges. Furthermore, we show that this approach provides nearoptimal solutions to the LP in practice.