Title: Densest Subgraph: Supermodularity, Iterative Peeling, and Flow
The densest subgraph problem (DSG), in its simplest form, is the following: given an undirected graph $G=(V,E)$, find a subset $S \subseteq V$ of vertices that maximizes the ratio $|E(S)|/|S|$, where $E(S)$ is the set of edges with both endpoints in $S$. DSG and several of its variants are well studied in theory and practice and have many applications in data mining and network analysis. In this paper we study fast algorithms and structural aspects of DSG through the lens of supermodularity. For this we consider the densest supermodular subset problem (DSSP): given a non-negative supermodular function $f: 2^V \rightarrow \mathbb{R}_+$, maximize $f(S)/|S|$.

For DSG we describe a simple flow-based algorithm that outputs a $(1-\epsilon)$-approximation in deterministic $\tilde{O}(m/\epsilon)$ time, where $m$ is the number of edges. Our algorithm is the first with a near-linear dependence on both $m$ and $1/\epsilon$, and it improves previous methods based on an LP relaxation. It generalizes to hypergraphs and also yields a faster algorithm for directed DSG.

Greedy peeling algorithms have been very popular for DSG and several of its variants due to their efficiency, empirical performance, and worst-case approximation guarantees. We describe a simple peeling algorithm for DSSP and analyze its approximation guarantee in a fashion that unifies several existing results. Boob et al. [Boob et al., 2020] developed an iterative peeling algorithm for DSG which appears to work very well in practice, and conjectured that it converges to optimality. We affirmatively answer their conjecture, and in fact prove that a natural generalization of their algorithm converges to a $(1-\epsilon)$-approximation for any supermodular function $f$; the key to our proof is an LP formulation derived via the Lovász extension of a supermodular function. For DSG the bound on the number of iterations we prove is $O(\frac{\Delta \ln |V|}{\lambda^*} \cdot \frac{1}{\epsilon^2})$, where $\Delta$ is the maximum degree and $\lambda^*$ is the optimum value. Our work suggests that iterative peeling can be an effective heuristic for several objectives considered in the literature.

Finally, we show that the 2-approximation for densest-at-least-$k$ subgraph [Khuller and Saha, 2009] extends to the supermodular setting. We also give a unified analysis of the peeling algorithm for this problem, and via this analysis derive an approximation guarantee for a generalization of DSSP that maximizes $f(S)/g(|S|)$ for a concave function $g$.
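As background for the peeling discussion above, the sketch below shows the classical single-pass greedy peeling heuristic for DSG (the well-known 1/2-approximation): repeatedly delete a minimum-degree vertex and keep the densest intermediate subgraph. This is the baseline that the iterative peeling results build on, not the paper's flow-based or supermodular algorithms; the edge-list input format and the data structures are assumptions made for illustration.

```python
from collections import defaultdict
import heapq

def greedy_peel(edges):
    """Classical greedy peeling for densest subgraph (a 1/2-approximation).

    edges: iterable of (u, v) pairs of an undirected simple graph.
    Returns (best_density, best_subset) where density is |E(S)|/|S|.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)

    alive = set(adj)                       # vertices not yet removed
    m_alive = sum(deg.values()) // 2       # edges among the remaining vertices
    best_density, best_size = 0.0, 0
    order = []                             # removal order, used to recover the best subset

    while alive:
        density = m_alive / len(alive)
        if density > best_density:
            best_density, best_size = density, len(alive)
        d, v = heapq.heappop(heap)
        if v not in alive or d != deg[v]:
            continue                       # stale heap entry
        alive.remove(v)                    # remove a minimum-degree vertex
        order.append(v)
        m_alive -= deg[v]
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))

    best_subset = set(order[len(order) - best_size:]) if best_size else set()
    return best_density, best_subset

# Example: K4 plus a pendant vertex; the densest subgraph is the K4 (density 6/4 = 1.5).
print(greedy_peel([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]))
```

On this small example the single peel already finds the optimum; in general it only guarantees half the optimal density, which is what motivates the iterative variants studied in the paper.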
Award ID(s):
1910149 2129816
PAR ID:
10348209
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms
Page Range / eLocation ID:
1531 - 1555
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J.; Herman, Grzegorz (Ed.)
    Boob et al. [Boob et al., 2020] described an iterative peeling algorithm called Greedy++ for the Densest Subgraph Problem (DSG) and conjectured that it converges to an optimum solution. Chekuri, Quanrud and Torres [Chandra Chekuri et al., 2022] extended the algorithm to supermodular density problems (of which DSG is a special case) and proved that the resulting algorithm Super-Greedy++ (and hence also Greedy++) converges. In this paper we revisit the convergence proof and provide a different perspective. This is done via a connection to Fujishige's quadratic program for finding a lexicographically optimal base in a (contra-)polymatroid [Satoru Fujishige, 1980], and a noisy version of the Frank-Wolfe method from convex optimization [Frank and Wolfe, 1956; Jaggi, 2013]. This yields a simpler convergence proof, and also shows a stronger property: Super-Greedy++ converges to the optimal dense decomposition vector, answering a question raised in Harb et al. [Harb et al., 2022]. A second contribution of the paper is to understand Thorup's work on ideal tree packing and greedy tree packing [Thorup, 2007; Thorup, 2008] via the Frank-Wolfe algorithm applied to finding a lexicographically optimum base in the graphic matroid. This yields a simpler and more transparent proof. The two results appear disparate but are unified via Fujishige's result and convex optimization.
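To illustrate the iterative peeling scheme discussed above, here is a minimal sketch in the spirit of Greedy++: each pass re-peels the whole graph, ordering vertices by accumulated load plus current degree, and the degree at removal time is added to the load used in later passes. The load update, pass count, and data structures are illustrative assumptions based on how the algorithm is commonly described, not the exact pseudocode of Boob et al. or of Super-Greedy++.

```python
from collections import defaultdict
import heapq

def greedy_plus_plus(edges, passes=10):
    """Iterative peeling sketch for densest subgraph (Greedy++-style).

    Each pass peels vertices in order of load(v) + current degree(v);
    the degree at removal time is added to the vertex's load for the
    next pass.  Returns the best density observed over all passes.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    load = {v: 0 for v in adj}
    best_density = 0.0

    for _ in range(passes):
        deg = {v: len(nbrs) for v, nbrs in adj.items()}
        heap = [(load[v] + d, v) for v, d in deg.items()]
        heapq.heapify(heap)
        alive = set(adj)
        m_alive = sum(deg.values()) // 2

        while alive:
            best_density = max(best_density, m_alive / len(alive))
            key, v = heapq.heappop(heap)
            if v not in alive or key != load[v] + deg[v]:
                continue                   # stale heap entry
            load[v] += deg[v]              # carry this pass's degree into the load
            alive.remove(v)
            m_alive -= deg[v]
            for u in adj[v]:
                if u in alive:
                    deg[u] -= 1
                    heapq.heappush(heap, (load[u] + deg[u], u))

    return best_density

# Example: K4 with a pendant vertex; the densest subgraph is the K4 (density 1.5).
print(greedy_plus_plus([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]))
```

As the number of passes grows, the best density seen converges to the optimum, which is the convergence property whose proof the abstract above revisits.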
  2. Kumar, Amit; Ron-Zewi, Noga (Ed.)
    Dense subgraph discovery is an important problem in graph mining and network analysis with several applications. Two canonical polynomial-time solvable problems here are to find a maxcore (a subgraph of maximum minimum degree) and to find a densest subgraph (a subgraph of maximum average degree). Veldt, Benson, and Kleinberg [Veldt et al., 2021] introduced the generalized p-mean densest subgraph problem, which captures the maxcore problem when p = -∞ and the densest subgraph problem when p = 1. They observed that for p ≥ 1 the objective function is supermodular and hence the problem can be solved in polynomial time. In this work, we focus on the p-mean densest subgraph problem for p ∈ (-∞, 1). We prove that for every p ∈ (-∞, 1) the problem is NP-hard, thus resolving an open question from [Veldt et al., 2021]. We also show that for every p ∈ (0, 1) the weighted version of the problem is APX-hard. On the algorithmic front, we describe two simple 1/2-approximation algorithms for every p ∈ (-∞, 1). We complement the approximation algorithms by exhibiting non-trivial instances on which they simultaneously achieve an approximation factor of at most 1/2.
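To make the objective above concrete, the snippet below evaluates the p-mean of the internal degree sequence of a candidate set S, which is (as I understand the formulation of Veldt, Benson, and Kleinberg) the quantity the generalized p-mean densest subgraph problem maximizes over S; p = 1 gives the average degree and p → -∞ the minimum degree. The handling of zero degrees and of the limiting cases is a simplifying assumption made for this sketch.

```python
import math

def p_mean_degree(edges, S, p):
    """p-mean of the internal degree sequence of S in the graph given by `edges`.

    p = 1      -> average degree of the induced subgraph G[S]
    p -> -inf  -> minimum internal degree (the max-core objective)
    """
    S = set(S)
    deg = {v: 0 for v in S}
    for u, v in edges:
        if u in S and v in S:
            deg[u] += 1
            deg[v] += 1
    d = list(deg.values())
    if not d:
        return 0.0
    if p == float("-inf"):
        return float(min(d))
    if p == 0:                 # geometric mean, the p -> 0 limit
        return 0.0 if min(d) == 0 else math.exp(sum(math.log(x) for x in d) / len(d))
    if min(d) == 0 and p < 0:  # convention chosen for this sketch
        return 0.0
    return (sum(x ** p for x in d) / len(d)) ** (1.0 / p)

edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
for p in (1, 0.5, -1, float("-inf")):
    print(p, round(p_mean_degree(edges, {1, 2, 3, 4, 5}, p), 3))
```

The abstract above shows that maximizing this objective over all S is NP-hard for every p ∈ (-∞, 1), while simple algorithms still achieve a 1/2-approximation.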
  3. Given a simple graph $G$, the irregularity strength of $G$, denoted $s(G)$, is the least positive integer $k$ such that there is a weight assignment on edges $f: E(G) \to \{1,2,\dots, k\}$ for which each vertex weight $f^V(v):= \sum_{u: \{u,v\}\in E(G)} f(\{u,v\})$ is unique amongst all $v\in V(G)$. In 1987, Faudree and Lehel conjectured that there is a constant $c$ such that $s(G) \leq n/d + c$ for all $d$-regular graphs $G$ on $n$ vertices with $d>1$, whereas it is trivial that $s(G) \geq n/d$. In this short note we prove that the Faudree-Lehel Conjecture holds when $d \geq n^{0.8+\epsilon}$ for any fixed $\epsilon >0$, with a small additive constant $c=28$ for $n$ large enough. Furthermore, we confirm the conjecture asymptotically by proving that for any fixed $\beta\in(0,1/4)$ there is a constant $C$ such that for all $d$-regular graphs $G$, $s(G) \leq \frac{n}{d}(1+\frac{C}{d^{\beta}})+28$, extending and improving a recent result of Przybyło that $s(G) \leq \frac{n}{d}(1+ \frac{1}{\ln^{\epsilon/19}n})$ whenever $d\in [\ln^{1+\epsilon} n, n/\ln^{\epsilon}n]$ and $n$ is large enough.
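Since the definition above is compact, a brute-force checker may help make it concrete: the sketch below finds the smallest k for which some edge weighting f: E → {1, …, k} makes all vertex weight sums distinct, i.e. the irregularity strength s(G). It is exponential in the number of edges and purely illustrative; the k_max cutoff is an assumption.

```python
from itertools import product

def irregularity_strength(vertices, edges, k_max=10):
    """Smallest k such that some edge weighting f: E -> {1,...,k} makes all
    vertex weights f^V(v) (sum of incident edge weights) pairwise distinct.
    Brute force; only feasible for very small graphs.
    """
    for k in range(1, k_max + 1):
        for weights in product(range(1, k + 1), repeat=len(edges)):
            vertex_weight = {v: 0 for v in vertices}
            for (u, v), w in zip(edges, weights):
                vertex_weight[u] += w
                vertex_weight[v] += w
            if len(set(vertex_weight.values())) == len(vertices):
                return k
    return None  # not found within k_max

# Example: the triangle K_3 has irregularity strength 3
# (e.g. edge weights 1, 2, 3 give vertex weights 3, 4, 5).
print(irregularity_strength([1, 2, 3], [(1, 2), (2, 3), (1, 3)]))
```

For the triangle the answer is 3, above the trivial lower bound n/d = 3/2 mentioned in the abstract.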
  4. Transportation cost is an attractive similarity measure between probability distributions due to its many useful theoretical properties. However, solving optimal transport exactly can be prohibitively expensive. Therefore, there has been significant effort towards the design of scalable approximation algorithms. Previous combinatorial results [Sharathkumar, Agarwal STOC '12; Agarwal, Sharathkumar STOC '14] have focused primarily on the design of near-linear time multiplicative approximation algorithms. There has also been an effort to design approximate solutions with additive errors [Cuturi NIPS '13; Altschuler et al. NIPS '17; Dvurechensky et al. ICML '18; Quanrud SOSA '19] within a time bound that is linear in the size of the cost matrix and polynomial in $C/\delta$; here $C$ is the largest value in the cost matrix and $\delta$ is the additive error. We present an adaptation of the classical graph algorithm of Gabow and Tarjan and provide a novel analysis of this algorithm that bounds its execution time by $O(\frac{n^2 C}{\delta} + \frac{nC^2}{\delta^2})$. Our algorithm is extremely simple and executes, for an arbitrarily small constant $\epsilon$, only $\lfloor \frac{2C}{(1-\epsilon)\delta} \rfloor + 1$ iterations, where each iteration consists only of a Dijkstra-type search followed by a depth-first search. We also provide empirical results suggesting that our algorithm is competitive in execution time with a sequential implementation of the Sinkhorn algorithm. Moreover, our algorithm quickly computes a solution for very small values of $\delta$, whereas the Sinkhorn algorithm slows down due to numerical instability.
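For context on the Sinkhorn baseline mentioned above (not the Gabow-Tarjan-based algorithm the abstract describes), here is a minimal sketch of entropically regularized optimal transport via Sinkhorn iterations; the regularization strength, iteration count, and example data are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.05, iters=1000):
    """Entropically regularized OT between histograms a and b with cost matrix C.

    Alternately rescales rows and columns of K = exp(-C/reg) so that the
    transport plan's marginals match a and b; returns (plan, approx_cost).
    """
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]
    return plan, float(np.sum(plan * C))

# Tiny example: move mass between two 3-point distributions on a line.
x = np.array([0.0, 1.0, 2.0])
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.3, 0.5])
C = np.abs(x[:, None] - x[None, :])      # |x_i - x_j| ground cost
plan, cost = sinkhorn(a, b, C)
print(np.round(plan, 3), round(cost, 3))
```

Smaller additive error typically requires a smaller regularizer, which is where the numerical instability noted in the abstract shows up.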
  5. We study the densest subgraph problem and give algorithms via multiplicative weights update and area convexity that converge in $O\left(\frac{\log m}{\epsilon^{2}}\right)$ and $O\left(\frac{\log m}{\epsilon}\right)$ iterations, respectively, both with nearly-linear time per iteration. Compared with the work by Bahmani et al. (2014), our MWU algorithm uses a very different and much simpler procedure for recovering the dense subgraph from the fractional solution and does not employ a binary search. Compared with the work by Boob et al. (2019), our algorithm via area convexity improves the iteration complexity by a factor of $\Delta$, the maximum degree in the graph, and matches the fastest theoretical runtime currently known via flows (Chekuri et al., 2022) in total time. Next, we study the dense subgraph decomposition problem and give the first practical iterative algorithm with linear convergence rate $O\left(mn\log\frac{1}{\epsilon}\right)$ via accelerated random coordinate descent. This significantly improves over the $O\left(\frac{m\sqrt{mn\Delta}}{\epsilon}\right)$ time of the FISTA-based algorithm by Harb et al. (2022). In the high-precision regime $\epsilon\ll\frac{1}{n}$, where we can even recover the exact solution, our algorithm has a total runtime of $O\left(mn\log n\right)$, matching the state-of-the-art exact algorithm via parametric flows (Gallo et al., 1989). Empirically, we show that this algorithm is very practical and scales to very large graphs, and its performance is competitive with widely used methods that have significantly weaker theoretical guarantees.
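As background for the MWU discussion above, the standard LP relaxation of the densest subgraph problem (Charikar's formulation, which approximate solvers of this kind typically target; included here only as a reminder, not as the specific formulation the abstract's algorithms use) is:

```latex
\begin{align*}
\max \quad & \sum_{uv \in E} x_{uv} \\
\text{s.t.} \quad & x_{uv} \le y_u \ \text{ and } \ x_{uv} \le y_v \quad \text{for all } uv \in E, \\
& \sum_{v \in V} y_v \le 1, \\
& x_{uv} \ge 0, \quad y_v \ge 0.
\end{align*}
```

Its optimum equals $\max_{S \subseteq V} |E(S)|/|S|$, and in the standard analysis a densest subgraph is recovered from a level set $\{v : y_v \ge r\}$ of a (near-)optimal solution; the abstract above emphasizes that its recovery step is simpler than in earlier MWU-based approaches and avoids a binary search.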