skip to main content


Title: Densest Subgraph: Supermodularity, Iterative Peeling, and Flow
The densest subgraph problem in a graph (\dsg), in the simplest form, is the following. Given an undirected graph $G=(V,E)$ find a subset $S \subseteq V$ of vertices that maximizes the ratio $|E(S)|/|S|$ where $E(S)$ is the set of edges with both endpoints in $S$. \dsg and several of its variants are well-studied in theory and practice and have many applications in data mining and network analysis. In this paper we study fast algorithms and structural aspects of \dsg via the lens of \emph{supermodularity}. For this we consider the densest supermodular subset problem (\dssp): given a non-negative supermodular function $f: 2^V \rightarrow \mathbb{R}_+$, maximize $f(S)/|S|$. For \dsg we describe a simple flow-based algorithm that outputs a $(1-\eps)$-approximation in deterministic $\tilde{O}(m/\eps)$ time where $m$ is the number of edges. Our algorithm is the first to have a near-linear dependence on $m$ and $1/\eps$ and improves previous methods based on an LP relaxation. It generalizes to hypergraphs, and also yields a faster algorithm for directed \dsg. Greedy peeling algorithms have been very popular for \dsg and several variants due to their efficiency, empirical performance, and worst-case approximation guarantees. We describe a simple peeling algorithm for \dssp and analyze its approximation guarantee in a fashion that unifies several existing results. Boob et al.\ \cite{bgpstww-20} developed an \emph{iterative} peeling algorithm for \dsg which appears to work very well in practice, and made a conjecture about its convergence to optimality. We affirmatively answer their conjecture, and in fact prove that a natural generalization of their algorithm converges to a $(1-\eps)$-approximation for \emph{any} supermodular function $f$; the key to our proof is to consider an LP formulation that is derived via the \Lovasz extension of a supermodular function. For \dsg the bound on the number of iterations we prove is $O(\frac{\Delta \ln |V|}{\lambda^*}\cdot \frac{1}{\eps^2})$ where $\Delta$ is the maximum degree and $\lambda^*$ is the optimum value. Our work suggests that iterative peeling can be an effective heuristic for several objectives considered in the literature. Finally, we show that the $2$-approximation for densest-at-least-$k$ subgraph \cite{ks-09} extends to the supermodular setting. We also give a unified analysis of the peeling algorithm for this problem, and via this analysis derive an approximation guarantee for a generalization of \dssp to maximize $f(S)/g(|S|)$ for a concave function $g$.  more » « less
Award ID(s):
1910149 2129816
NSF-PAR ID:
10348209
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms
Page Range / eLocation ID:
1531 - 1555
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Megow, Nicole ; Smith, Adam (Ed.)
    Maximum weight independent set (MWIS) admits a 1/k-approximation in inductively k-independent graphs [Karhan Akcoglu et al., 2002; Ye and Borodin, 2012] and a 1/(2k)-approximation in k-perfectly orientable graphs [Kammer and Tholey, 2014]. These are a parameterized class of graphs that generalize k-degenerate graphs, chordal graphs, and intersection graphs of various geometric shapes such as intervals, pseudo-disks, and several others [Ye and Borodin, 2012; Kammer and Tholey, 2014]. We consider a generalization of MWIS to a submodular objective. Given a graph G = (V,E) and a non-negative submodular function f: 2^V → ℝ_+, the goal is to approximately solve max_{S ∈ ℐ_G} f(S) where ℐ_G is the set of independent sets of G. We obtain an Ω(1/k)-approximation for this problem in the two mentioned graph classes. The first approach is via the multilinear relaxation framework and a simple contention resolution scheme, and this results in a randomized algorithm with approximation ratio at least 1/e(k+1). This approach also yields parallel (or low-adaptivity) approximations. Motivated by the goal of designing efficient and deterministic algorithms, we describe two other algorithms for inductively k-independent graphs that are inspired by work on streaming algorithms: a preemptive greedy algorithm and a primal-dual algorithm. In addition to being simpler and faster, these algorithms, in the monotone submodular case, yield the first deterministic constant factor approximations for various special cases that have been previously considered such as intersection graphs of intervals, disks and pseudo-disks. 
    more » « less
  2. Gørtz, Inge Li ; Farach-Colton, Martin ; Puglisi, Simon J. ; Herman, Grzegorz (Ed.)
    Boob et al. [Boob et al., 2020] described an iterative peeling algorithm called Greedy++ for the Densest Subgraph Problem (DSG) and conjectured that it converges to an optimum solution. Chekuri, Qaunrud and Torres [Chandra Chekuri et al., 2022] extended the algorithm to supermodular density problems (of which DSG is a special case) and proved that the resulting algorithm Super-Greedy++ (and hence also Greedy++) converges. In this paper we revisit the convergence proof and provide a different perspective. This is done via a connection to Fujishige’s quadratic program for finding a lexicographically optimal base in a (contra) polymatroid [Satoru Fujishige, 1980], and a noisy version of the Frank-Wolfe method from convex optimization [Frank and Wolfe, 1956; Jaggi, 2013]. This yields a simpler convergence proof, and also shows a stronger property that Super-Greedy++ converges to the optimal dense decomposition vector, answering a question raised in Harb et al. [Harb et al., 2022]. A second contribution of the paper is to understand Thorup’s work on ideal tree packing and greedy tree packing [Thorup, 2007; Thorup, 2008] via the Frank-Wolfe algorithm applied to find a lexicographically optimum base in the graphic matroid. This yields a simpler and transparent proof. The two results appear disparate but are unified via Fujishige’s result and convex optimization. 
    more » « less
  3. Abstract

    For a subgraph$G$of the blow-up of a graph$F$, we let$\delta ^*(G)$be the smallest minimum degree over all of the bipartite subgraphs of$G$induced by pairs of parts that correspond to edges of$F$. Johansson proved that if$G$is a spanning subgraph of the blow-up of$C_3$with parts of size$n$and$\delta ^*(G) \ge \frac{2}{3}n + \sqrt{n}$, then$G$contains$n$vertex disjoint triangles, and presented the following conjecture of Häggkvist. If$G$is a spanning subgraph of the blow-up of$C_k$with parts of size$n$and$\delta ^*(G) \ge \left(1 + \frac 1k\right)\frac n2 + 1$, then$G$contains$n$vertex disjoint copies of$C_k$such that each$C_k$intersects each of the$k$parts exactly once. A similar conjecture was also made by Fischer and the case$k=3$was proved for large$n$by Magyar and Martin.

    In this paper, we prove the conjecture of Häggkvist asymptotically. We also pose a conjecture which generalises this result by allowing the minimum degree conditions in each bipartite subgraph induced by pairs of parts of$G$to vary. We support this new conjecture by proving the triangle case. This result generalises Johannson’s result asymptotically.

     
    more » « less
  4. null (Ed.)
    Diffusion of information in social network has been the focus of intense research in the recent past decades due to its significant impact in shaping public discourse through group/individual influence. Existing research primarily models influence as a binary property of entities: influenced or not influenced. While this is a useful abstraction, it discards the notion of degree of influence, i.e., certain individuals may be influenced ``more'' than others. We introduce the notion of \emph{attitude}, which, as described in social psychology, is the degree by which an entity is influenced by the information. Intuitively, attitude captures the number of distinct neighbors of an entity influencing the latter. We present an information diffusion model (AIC model) that quantifies the degree of influence, i.e., attitude of individuals, in a social network. With this model, we formulate and study attitude maximization problem. We prove that the function for computing attitude is monotonic and sub-modular, and the attitude maximization problem is NP-Hard. We present a greedy algorithm for maximization with an approximation guarantee of $(1-1/e)$. In the context of AIC model, we study two problems, with the aim to investigate the scenarios where attaining individuals with high attitude is objectively more important than maximizing the attitude of the entire network. In the first problem, we introduce the notion of \emph{actionable attitude}; intuitively, individuals with actionable attitude are likely to ``act'' on their attained attitude. We show that the function for computing actionable attitude, unlike that for computing attitude, is non-submodular and however is \emph{approximately submodular}. We present approximation algorithm for maximizing actionable attitude in a network. In the second problem, we consider identifying the number of individuals in the network with attitude above a certain value, a threshold. In this context, the function for computing the number of individuals with attitude above a given threshold induced by a seed set is \emph{neither submodular nor supermodular}. We present heuristics for realizing the solution to the problem. We experimentally evaluated our algorithms and studied empirical properties of the attitude of nodes in network such as spatial and value distribution of high attitude nodes. 
    more » « less
  5. In recent years several compressed indexes based on variants of the Burrows-Wheeler transformation have been introduced. Some of these are used to index structures far more complex than a single string, as was originally done with the FM-index [Ferragina and Manzini, J. ACM 2005]. As such, there has been an increasing effort to better understand under which conditions such an indexing scheme is possible. This has led to the introduction of Wheeler graphs [Gagie et al., Theor. Comput. Sci., 2017]. Gagie et al. showed that de Bruijn graphs, generalized compressed suffix arrays, and several other BWT related structures can be represented as Wheeler graphs and that Wheeler graphs can be indexed in a way which is space-efficient. Hence, being able to recognize whether a given graph is a Wheeler graph, or being able to approximate a given graph by a Wheeler graph, could have numerous applications in indexing. Here we resolve the open question of whether there exists an efficient algorithm for recognizing if a given graph is a Wheeler graph. We present - The problem of recognizing whether a given graph G=(V,E) is a Wheeler graph is NP-complete for any edge label alphabet of size sigma >= 2, even when G is a DAG. This holds even on a restricted, subset of graphs called d-NFA's for d >= 5. This is in contrast to recent results demonstrating the problem can be solved in polynomial time for d-NFA's where d <= 2. We also show the recognition problem can be solved in linear time for sigma =1; - There exists an 2^{e log sigma + O(n + e)} time exact algorithm where n = |V| and e = |E|. This algorithm relies on graph isomorphism being computable in strictly sub-exponential time; - We define an optimization variant of the problem called Wheeler Graph Violation, abbreviated WGV, where the aim is to remove the minimum number of edges in order to obtain a Wheeler graph. We show WGV is APX-hard, even when G is a DAG, implying there exists a constant C >= 1 for which there is no C-approximation algorithm (unless P = NP). Also, conditioned on the Unique Games Conjecture, for all C >= 1, it is NP-hard to find a C-approximation; - We define the Wheeler Subgraph problem, abbreviated WS, where the aim is to find the largest subgraph which is a Wheeler Graph (the dual of the WGV). In contrast to WGV, we prove that the WS problem is in APX for sigma=O(1); The above findings suggest that most problems under this theme are computationally difficult. However, we identify a class of graphs for which the recognition problem is polynomial-time solvable, raising the open question of which parameters determine this problem's difficulty. 
    more » « less