 Award ID(s):
 1652835
 NSFPAR ID:
 10051106
 Date Published:
 Journal Name:
 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
 Page Range / eLocation ID:
 4617 to 4624
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Explaining the results of machine learning algorithms is crucial given the rapid growth and potential applicability of these methods in critical domains including healthcare, defense, and autonomous driving. In this paper, we address this problem in the context of Markov Logic Networks (MLNs), highly expressive statistical relational models that combine first-order logic with probabilistic graphical models. MLNs are generally considered interpretable models, i.e., they can be understood by humans more easily than models learned by approaches such as deep learning. At the same time, however, it is not straightforward to obtain human-understandable explanations specific to an observed inference result (e.g., a marginal probability estimate). This is because the MLN provides a lifted interpretation, one that generalizes over all possible worlds/instantiations and is therefore not query- or evidence-specific. In this paper, we extract grounded explanations, i.e., explanations defined w.r.t. specific inference queries and observed evidence. We extract these explanations from importance weights defined over the MLN formulas, which encode the contribution of each formula to the final inference result. We validate our approach on real-world problems related to analyzing Yelp reviews, and show through user studies that our explanations are richer than those of state-of-the-art non-relational explainers such as LIME.
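The idea of attributing an inference result to individual formulas can be illustrated with a standard MLN identity: the log-odds between two worlds decomposes as a sum of per-formula terms, weight times the change in satisfied-grounding counts. The sketch below scores formulas by that summand. Note this is only an illustration of formula-level attribution; the paper's importance weights are defined differently, and the function and argument names here are hypothetical.

```python
def formula_contributions(weights, counts_with_query, counts_without):
    """Toy per-formula contribution score for an MLN query.

    In an MLN, log P(x) / P(x') = sum_f w_f * (n_f(x) - n_f(x')),
    where n_f counts satisfied groundings of formula f.  Attributing
    each summand to its formula gives a simple grounded "importance":
    formulas whose satisfied-grounding count changes most (scaled by
    weight) contribute most to the query's log-odds.
    """
    return {f: weights[f] * (counts_with_query[f] - counts_without[f])
            for f in weights}
```

A formula whose grounding counts are identical with and without the query contributes zero, matching the intuition that it is irrelevant to this particular inference result.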

We consider the problem of computing r-th order statistics, namely finding an assignment having rank r in a probabilistic graphical model. We show that the problem is NP-hard even when the graphical model has no edges (zero-treewidth models), via a reduction from the partition problem. We use this reduction, specifically a pseudo-polynomial time algorithm for number partitioning, to yield a pseudo-polynomial time approximation algorithm for solving the r-th order statistics problem in zero-treewidth models. We then extend this algorithm to arbitrary graphical models by generalizing it to tree decompositions, and demonstrate via experimental evaluation on various datasets that our proposed algorithm is more accurate than sampling algorithms.
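The pseudo-polynomial ingredient mentioned above is the classic dynamic program for number partitioning, whose running time depends on the magnitude of the total weight rather than only on the input length. A minimal sketch (the decision version, not the paper's order-statistics algorithm):

```python
def can_partition(weights):
    """Pseudo-polynomial DP for the partition problem: decide whether
    `weights` (positive integers) splits into two halves of equal sum.

    Runs in O(n * total) time, polynomial in the numeric value of the
    sum rather than in its bit length -- hence pseudo-polynomial.
    """
    total = sum(weights)
    if total % 2:
        return False
    target = total // 2
    reachable = {0}  # subset sums achievable so far, capped at target
    for w in weights:
        reachable |= {s + w for s in reachable if s + w <= target}
    return target in reachable
```

For example, `[3, 1, 1, 2, 2, 1]` partitions into two halves summing to 5, while `[1, 2, 5]` does not.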

Statistical relational learning models are powerful tools that combine ideas from first-order logic with probabilistic graphical models to represent complex dependencies. Despite their success in encoding large problems with a compact set of weighted rules, performing inference over these models is often challenging. In this paper, we show how to effectively combine two powerful ideas for scaling inference for large graphical models. The first idea, lifted inference, is a well-studied approach to speeding up inference in graphical models by exploiting symmetries in the underlying problem. The second idea is to frame maximum a posteriori (MAP) inference as a convex optimization problem and use the alternating direction method of multipliers (ADMM) to solve the problem in parallel. A well-studied relaxation of the combinatorial optimization problem defined for logical Markov random fields gives rise to a hinge-loss Markov random field (HL-MRF), for which MAP inference is a convex optimization problem. We show how the formalism introduced for coloring weighted bipartite graphs using a color refinement algorithm can be integrated with the ADMM optimization technique to take advantage of the sparse dependency structures of HL-MRFs. Our proposed approach, lifted hinge-loss Markov random fields (LHL-MRFs), preserves the structure of the original problem after lifting and solves lifted inference as distributed convex optimization with ADMM. In our empirical evaluation on real-world problems, we observe up to a three-fold speedup in inference over HL-MRFs.
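The structure ADMM exploits here is that each potential gets a local copy of its variables, the local updates run independently (hence in parallel), and a consensus step reconciles the copies. A minimal sketch of that consensus pattern, using squared losses instead of hinge losses so the local update has a one-line closed form; this is an illustration of the ADMM mechanics, not the paper's lifted HL-MRF solver:

```python
def admm_consensus(weights, targets, rho=1.0, iters=500):
    """Tiny scalar consensus ADMM: minimize sum_i w_i * (x - a_i)^2.

    Each term i keeps a local copy x_i, all copies are tied to a shared
    consensus variable z, and scaled duals u_i enforce agreement.  The
    x_i updates are independent of each other, which is what makes the
    scheme parallelizable.  The minimizer is the weighted mean of the
    targets, so convergence is easy to check.
    """
    n = len(weights)
    x = [0.0] * n
    u = [0.0] * n
    z = 0.0
    for _ in range(iters):
        # Local updates (run in parallel in a real solver):
        # argmin_x  w*(x-a)^2 + (rho/2)*(x - z + u)^2  in closed form.
        x = [(2 * w * a + rho * (z - ui)) / (2 * w + rho)
             for w, a, ui in zip(weights, targets, u)]
        z = sum(xi + ui for xi, ui in zip(x, u)) / n  # consensus average
        u = [ui + xi - z for ui, xi in zip(u, x)]     # dual update
    return z
```

With weights `[1, 2, 3]` and targets `[0, 1, 2]`, the iterates converge to the weighted mean 8/6.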

Abstract Phylogenetic and discrete-trait evolutionary inference depend heavily on an appropriate characterization of the underlying character substitution process. In this paper, we present random-effects substitution models that extend common continuous-time Markov chain models into a richer class of processes capable of capturing a wider variety of substitution dynamics. As these random-effects substitution models often require many more parameters than their usual counterparts, inference can be both statistically and computationally challenging. Thus, we also propose an efficient approach to compute an approximation to the gradient of the data likelihood with respect to all unknown substitution model parameters. We demonstrate that this approximate gradient enables scaling of sampling-based inference, namely Bayesian inference via Hamiltonian Monte Carlo, under random-effects substitution models across large trees and state spaces.
Applied to a dataset of 583 SARS-CoV-2 sequences, an HKY model with random effects shows strong signals of non-reversibility in the substitution process, and posterior predictive model checks clearly show that it is a more adequate model than a reversible model. When analyzing the pattern of phylogeographic spread of 1441 influenza A virus (H3N2) sequences between 14 regions, a random-effects phylogeographic substitution model infers that air travel volume adequately predicts almost all dispersal rates. A random-effects state-dependent substitution model reveals no evidence for an effect of arboreality on the swimming mode in the tree frog subfamily Hylinae. Simulations reveal that random-effects substitution models can accommodate both negligible and radical departures from the underlying base substitution model. We show that our gradient-based inference approach is over an order of magnitude more time-efficient than conventional approaches.
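The core modeling idea, perturbing each off-diagonal rate of a base substitution model multiplicatively, can be sketched for HKY85. Below, each rate q_ij is the usual HKY term (stationary frequency times a transition/transversion factor) scaled by exp(eps_ij), with eps = 0 recovering plain HKY; the function name and the exact parameterization are illustrative assumptions, not the paper's implementation.

```python
import math

def hky_random_effects_rates(pi, kappa, eps):
    """Toy HKY85 rate matrix with multiplicative random effects.

    q_ij = pi_j * (kappa if i->j is a transition else 1) * exp(eps_ij),
    with the diagonal set so every row sums to zero, as required of a
    continuous-time Markov chain rate matrix.  `eps` maps base pairs
    (e.g. ("A", "G")) to per-rate perturbations; an empty dict gives
    back the plain HKY model.
    """
    bases = "ACGT"
    transitions = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
    Q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(bases):
        for j, y in enumerate(bases):
            if i == j:
                continue
            rate = pi[y] * (kappa if (x, y) in transitions else 1.0)
            Q[i][j] = rate * math.exp(eps.get((x, y), 0.0))
        Q[i][i] = -sum(Q[i])  # rows of a rate matrix sum to zero
    return Q
```

With more free eps parameters than base parameters, the statistical and computational burden the abstract describes becomes apparent even in this toy: a 4-state model already carries up to 12 random effects.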

We consider message-efficient continuous random sampling from a distributed stream, where the probability of inclusion of an item in the sample is proportional to a weight associated with the item. The unweighted version, where all weights are equal, is well studied and admits tight upper and lower bounds on message complexity. For weighted sampling with replacement, there is a simple reduction to unweighted sampling with replacement. However, in many applications the stream may have only a few heavy items, which may dominate a random sample when chosen with replacement. Weighted sampling without replacement (weighted SWOR) eludes this issue, since such heavy items can be sampled at most once. In this work, we present the first message-optimal algorithm for weighted SWOR from a distributed stream. Our algorithm also has optimal space and time complexity. As an application of our algorithm for weighted SWOR, we derive the first distributed streaming algorithms for tracking heavy hitters with residual error. Here the goal is to identify stream items that contribute significantly to the residual stream, once the heaviest items are removed. Residual heavy hitters generalize the notion of $\ell_1$ heavy hitters and are important in streams that have a skewed distribution of weights. In addition to the upper bound, we also provide a lower bound on the message complexity that is nearly tight up to a $\log(1/\epsilon)$ factor. Finally, we use our weighted sampling algorithm to improve the message complexity of distributed $L_1$ tracking, also known as count tracking, which is a widely studied problem in distributed streaming. We also derive a tight message lower bound, which closes the message complexity of this fundamental problem.
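The classic single-stream algorithm for weighted SWOR (Efraimidis–Spirakis keys) conveys why heavy items cannot dominate: each item is sampled at most once. A minimal sketch of that centralized baseline; the paper's contribution is a message-optimal distributed version, which this does not implement.

```python
import heapq
import random

def weighted_swor(items, k, rng=random):
    """Weighted sampling without replacement via Efraimidis-Spirakis keys.

    Each (item, weight) pair draws key u**(1/w) with u uniform in (0, 1);
    the k largest keys form the sample.  Heavier items tend to draw
    larger keys, so inclusion probability grows with weight, yet every
    item can appear at most once.
    """
    heap = []  # min-heap of (key, item): root is the smallest kept key
    for item, w in items:
        key = rng.random() ** (1.0 / w)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]
```

One pass, O(k) space, O(log k) work per item: a reasonable baseline against which the message complexity of the distributed setting is measured.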