skip to main content

Title: Correlated Equilibria for Approximate Variational Inference in MRFs
Almost all of the work in graphical models for game theory has mirrored previous work in probabilistic graphical models. Our work considers the opposite direction: Taking advantage of advances in equilibrium computation for probabilistic inference. In particular, we present formulations of inference problems in Markov random fields (MRFs) as computation of equilibria in a certain class of game-theoretic graphical models. While some previous work explores this direction, we still lack a more precise connection between variational probabilistic inference in MRFs and correlated equilibria. This paper sharpens the connection, which helps us exploit relatively more recent theoretical and empirical results from the literature on algorithmic and computational game theory on the tractable, polynomial-time computation of exact or approximate correlated equilibria in graphical games with arbitrary, loopy graph structure. Our work discusses how to design new algorithms with equally tractable guarantees for the computation of approximate variational inference in MRFs. In addition, inspired by a previously stated game-theoretic view of tree-reweighted message-passing techniques for belief inference as a zero-sum game, we propose a different, general-sum potential game to design approximate fictitious-play techniques. Empirical evaluations on synthetic experiments and on an application to soft de-noising on real-world image datasets illustrate the performance of our proposed approach and shed some light on the conditions under which the resulting belief inference algorithms may be most effective relative to standard state-of-the-art methods.  more » « less
Award ID(s):
1907553 1643006 1054541
Author(s) / Creator(s):
; ;
Jaeger, Manfred; Nielsen, Thomas Dyhre
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Page Range / eLocation ID:
329 - 340
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Probabilistic graphical models, such as Markov random fields (MRF), exploit dependencies among random variables to model a rich family of joint probability distributions. Inference algorithms, such as belief propagation (BP), can effectively compute the marginal posteriors for decision making. Nonetheless, inferences involve sophisticated probability calculations and are difficult for humans to interpret. Among all existing explanation methods for MRFs, no method is designed for fair attributions of an inference outcome to elements on the MRF where the inference takes place. Shapley values provide rigorous attributions but so far have not been studied on MRFs. We thus define Shapley values for MRFs to capture both probabilistic and topological contributions of the variables on MRFs. We theoretically characterize the new definition regarding independence, equal contribution, additivity, and submodularity. As brute-force computation of the Shapley values is challenging, we propose GraphShapley, an approximation algorithm that exploits the decomposability of Shapley values, the structure of MRFs, and the iterative nature of BP inference to speed up the computation. In practice, we propose meta-explanations to explain the Shapley values and make them more accessible and trustworthy to human users. On four synthetic and nine real-world MRFs, we demonstrate that GraphShapley generates sensible and practical explanations. 
    more » « less
  2. Statistical relational learning models are powerful tools that combine ideas from first-order logic with probabilistic graphical models to represent complex dependencies. Despite their success in encoding large problems with a compact set of weighted rules, performing inference over these models is often challenging. In this paper, we show how to effectively combine two powerful ideas for scaling inference for large graphical models. The first idea, lifted inference, is a wellstudied approach to speeding up inference in graphical models by exploiting symmetries in the underlying problem. The second idea is to frame Maximum a posteriori (MAP) inference as a convex optimization problem and use alternating direction method of multipliers (ADMM) to solve the problem in parallel. A well-studied relaxation to the combinatorial optimization problem defined for logical Markov random fields gives rise to a hinge-loss Markov random field (HLMRF) for which MAP inference is a convex optimization problem. We show how the formalism introduced for coloring weighted bipartite graphs using a color refinement algorithm can be integrated with the ADMM optimization technique to take advantage of the sparse dependency structures of HLMRFs. Our proposed approach, lifted hinge-loss Markov random fields (LHL-MRFs), preserves the structure of the original problem after lifting and solves lifted inference as distributed convex optimization with ADMM. In our empirical evaluation on real-world problems, we observe up to a three times speed up in inference over HL-MRFs. 
    more » « less
  3. One of the primary challenges in graphical models is inference, or re-constructing a marginal probability from the graphical model’s factorized representation. While tractable for some graphs, the cost of inference grows exponentially with the graphical model’s complexity, necessitating approximation for more complex graphs. For interactive applications, latency is the dominant concern, making approximate inference the only feasible option. Unfortunately, approximate inference can be wasteful for interactive applications, as exact inference can still converge faster, even for moderately complex inference problems. In this paper, we propose a new family of convergent inference algorithms (CIAs) that bridge the gap between approximations and exact solutions, providing early, incrementally improving approximations that become exact after a finite period of time. We describe two specific CIAs based on a cryptographic technique called linear congruential generators, including a novel incremental join algorithm for dense relations called Leaky Joins. We conclude with experiments that demonstrate the utility of Leaky Joins for convergent inference: On both synthetic and real-world probabilistic graphical models, Leaky Joins converge to exact marginal probabilities almost as fast as state of the art exact inference algorithms, while simultaneously achieving approximations that are almost as good as state of the art approximation algorithms. 
    more » « less
  4. Bayesian models are established as one of the main successful paradigms for complex problems in machine learning. To handle intractable inference, research in this area has developed new approximation methods that are fast and effective. However, theoretical analysis of the performance of such approximations is not well developed. The paper furthers such analysis by providing bounds on the excess risk of variational inference algorithms and related regularized loss minimization algorithms for a large class of latent variable models with Gaussian latent variables. We strengthen previous results for variational algorithms by showing they are competitive with any point-estimate predictor. Unlike previous work, we also provide bounds on the risk of the \emph{Bayesian} predictor and not just the risk of the Gibbs predictor for the same approximate posterior. The bounds are applied in complex models including sparse Gaussian processes and correlated topic models. Theoretical results are complemented by identifying novel approximations to the Bayesian objective that attempt to minimize the risk directly. An empirical evaluation compares the variational and new algorithms shedding further light on their performance. 
    more » « less
  5. Gradient-based approximate inference methods, such as Stein variational gradient descent (SVGD), provide simple and general-purpose inference engines for differentiable continuous distributions. However, existing forms of SVGD cannot be directly applied to discrete distributions. In this work, we fill this gap by proposing a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions, on which the gradient-free SVGD is applied to perform efficient approximate inference. The empirical results show that our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo on various challenging benchmarks of discrete graphical models. We demonstrate that our method provides a promising tool for learning ensembles of binarized neural network (BNN), outperforming other widely used ensemble methods on learning binarized AlexNet on CIFAR-10 dataset. In addition, such transform can be straightforwardly employed in gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) test on discrete distributions. Our proposed method outperforms existing GOF test methods for intractable discrete distributions. 
    more » « less