skip to main content

Title: Order Statistics for Probabilistic Graphical Models

We consider the problem of computing r-th order statistics, namely finding an assignment having rank r in a probabilistic graphical model. We show that the problem is NP-hard even when the graphical model has no edges (zero-treewidth models) via a reduction from the partition problem. We use this reduction, specifically a pseudo-polynomial time algorithm for number partitioning to yield a pseudo-polynomial time approximation algorithm for solving the r-th order statistics problem in zero- treewidth models. We then extend this algorithm to arbitrary graphical models by generalizing it to tree decompositions, and demonstrate via experimental evaluation on various datasets that our proposed algorithm is more accurate than sampling algorithms.

more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Page Range / eLocation ID:
4625 to 4631
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We address the problem of scaling up local-search or sampling-based inference in Markov logic networks (MLNs) that have large shared sub-structures but no (or few) tied weights. Such untied MLNs are ubiquitous in practical applications. However, they have very few symmetries, and as a result lifted inference algorithms--the dominant approach for scaling up inference--perform poorly on them. The key idea in our approach is to reduce the hard, time-consuming sub-task in sampling algorithms, computing the sum of weights of features that satisfy a full assignment, to the problem of computing a set of partition functions of graphical models, each defined over the logical variables in a first-order formula. The importance of this reduction is that when the treewidth of all the graphical models is small, it yields an order of magnitude speedup. When the treewidth is large, we propose an over-symmetric approximation and experimentally demonstrate that it is both fast and accurate. 
    more » « less
  2. Abstract

    The tree‐independence number , first defined and studied by Dallard, Milanič, and Štorgel, is a variant of treewidth tailored to solving the maximum independent set problem. Over a series of papers, Abrishami et al. developed the so‐called central bag method to study induced obstructions to bounded treewidth. Among others, they showed that, in a certain superclass of (even hole, diamond, pyramid)‐free graphs, treewidth is bounded by a function of the clique number. In this paper, we relax the bounded clique number assumption, and show that has bounded . Via existing results, this yields a polynomial‐time algorithm for the Maximum Weight Independent Set problem in this class. Our result also corroborates, for this class of graphs, a conjecture of Dallard, Milanič, and Štorgel that in a hereditary graph class, is bounded if and only if the treewidth is bounded by a function of the clique number.

    more » « less
  3. Finding the mode of a high dimensional probability distribution $\mathcal{D}$ is a fundamental algorithmic problem in statistics and data analysis. There has been particular interest in efficient methods for solving the problem when $\mathcal{D}$ is represented as a mixture model or kernel density estimate, although few algorithmic results with worst-case approximation and runtime guarantees are known. In this work, we significantly generalize a result of (LeeLiMusco:2021) on mode approximation for Gaussian mixture models. We develop randomized dimensionality reduction methods for mixtures involving a broader class of kernels, including the popular logistic, sigmoid, and generalized Gaussian kernels. As in Lee et al.’s work, our dimensionality reduction results yield quasi-polynomial algorithms for mode finding with multiplicative accuracy $(1-\epsilon)$ for any $\epsilon > 0$. Moreover, when combined with gradient descent, they yield efficient practical heuristics for the problem. In addition to our positive results, we prove a hardness result for box kernels, showing that there is no polynomial time algorithm for finding the mode of a kernel density estimate, unless $\mathit{P} = \mathit{NP}$. Obtaining similar hardness results for kernels used in practice (like Gaussian or logistic kernels) is an interesting future direction. 
    more » « less
  4. A vertex of a plane digraph is bimodal if all its incoming edges (and hence all its outgoing edges) are consecutive in the cyclic order around it. A plane digraph is bimodal if all its vertices are bimodal. Bimodality is at the heart of many types of graph layouts, such as upward drawings, level-planar drawings, and L-drawings. If the graph is not bimodal, the Maximum Bimodal Subgraph (MBS) problem asks for an embedding-preserving bimodal subgraph with the maximum number of edges. We initiate the study of the MBS problem from the parameterized complexity perspective with two main results: (i) we describe an FPT algorithm parameterized by the branchwidth (and hence by the treewidth) of the graph; (ii) we establish that MBS parameterized by the number of non-bimodal vertices admits a polynomial kernel. As the byproduct of these results, we obtain a subexponential FPT algorithm and an efficient polynomial-time approximation scheme for MBS. 
    more » « less
  5. null (Ed.)

    Bipartite b-matching, where agents on one side of a market are matched to one or more agents or items on the other, is a classical model that is used in myriad application areas such as healthcare, advertising, education, and general resource allocation. Traditionally, the primary goal of such models is to maximize a linear function of the constituent matches (e.g., linear social welfare maximization) subject to some constraints. Recent work has studied a new goal of balancing whole-match diversity and economic efficiency, where the objective is instead a monotone submodular function over the matching. Basic versions of this problem are solvable in polynomial time. In this work, we prove that the problem of simultaneously maximizing diversity along several features (e.g., country of citizenship, gender, skills) is NP-hard. To address this problem, we develop the first combinatorial algorithm that constructs provably-optimal diverse b-matchings in pseudo-polynomial time. We also provide a Mixed-Integer Quadratic formulation for the same problem and show that our method guarantees optimal solutions and takes less computation time for a reviewer assignment application. The source code is made available at

    more » « less