Title: Evaluating Stability in Massive Social Networks: Efficient Streaming Algorithms for Structural Balance
Structural balance theory studies stability in networks. Given a n-vertex complete graph G = (V,E) whose edges are labeled positive or negative, the graph is considered balanced if every triangle either consists of three positive edges (three mutual "friends"), or one positive edge and two negative edges (two "friends" with a common "enemy"). From a computational perspective, structural balance turns out to be a special case of correlation clustering with the number of clusters at most two. The two main algorithmic problems of interest are: (i) detecting whether a given graph is balanced, or (ii) finding a partition that approximates the frustration index, i.e., the minimum number of edge flips that turn the graph balanced. We study these problems in the streaming model where edges are given one by one and focus on memory efficiency. We provide randomized single-pass algorithms for: (i) determining whether an input graph is balanced with O(log n) memory, and (ii) finding a partition that induces a (1 + ε)-approximation to the frustration index with O(n ⋅ polylog(n)) memory. We further provide several new lower bounds, complementing different aspects of our algorithms such as the need for randomization or approximation. To obtain our main results, we develop a method using pseudorandom generators (PRGs) to sample edges between independently-chosen vertices in graph streaming. Furthermore, our algorithm that approximates the frustration index improves the running time of the state-of-the-art correlation clustering with two clusters (Giotis-Guruswami algorithm [SODA 2006]) from n^O(1/ε²) to O(n²log³n/ε² + n log n ⋅ (1/ε)^O(1/ε⁴)) time for (1+ε)-approximation. These results may be of independent interest. more »« less
Streaming algorithmsstructural balancepseudo-randomness generatorTheory of computation → Streaming, sublinear and near linear time algorithmsTheory of computation → Pseudorandomness and derandomization
Charikar, Moses; Waingarten, Erik
(, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Bojańczyk, Mikołaj; Merelli, Emanuela; Woodruff, David P
(Ed.)
Given n points in 𝓁_p^d, we consider the problem of partitioning points into k clusters with associated centers. The cost of a clustering is the sum of p-th powers of distances of points to their cluster centers. For p ∈ [1,2], we design sketches of size poly(log(nd),k,1/ε) such that the cost of the optimal clustering can be estimated to within factor 1+ε, despite the fact that the compressed representation does not contain enough information to recover the cluster centers or the partition into clusters. This leads to a streaming algorithm for estimating the clustering cost with space poly(log(nd),k,1/ε). We also obtain a distributed memory algorithm, where the n points are arbitrarily partitioned amongst m machines, each of which sends information to a central party who then computes an approximation of the clustering cost. Prior to this work, no such streaming or distributed-memory algorithm was known with sublinear dependence on d for p ∈ [1,2).
Chuzhoy, Julia; Gao, Yu; Li, Jason; Nanongkai, Danupon; Peng, Richard; Saranurak, Thatchaphol
(, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020)
null
(Ed.)
We consider the classical Minimum Balanced Cut problem: given a graph $$G$$, compute a partition of its vertices into two subsets of roughly equal volume, while minimizing the number of edges connecting the subsets. We present the first {\em deterministic, almost-linear time} approximation algorithm for this problem. Specifically, our algorithm, given an $$n$$-vertex $$m$$-edge graph $$G$$ and any parameter $$1\leq r\leq O(\log n)$$, computes a $$(\log m)^{r^2}$$-approximation for Minimum Balanced Cut on $$G$$, in time $$O\left ( m^{1+O(1/r)+o(1)}\cdot (\log m)^{O(r^2)}\right )$$. In particular, we obtain a $$(\log m)^{1/\epsilon}$$-approximation in time $$m^{1+O(1/\sqrt{\epsilon})}$$ for any constant $$\epsilon$$, and a $$(\log m)^{f(m)}$$-approximation in time $$m^{1+o(1)}$$, for any slowly growing function $$m$$. We obtain deterministic algorithms with similar guarantees for the Sparsest Cut and the Lowest-Conductance Cut problems. Our algorithm for the Minimum Balanced Cut problem in fact provides a stronger guarantee: it either returns a balanced cut whose value is close to a given target value, or it certifies that such a cut does not exist by exhibiting a large subgraph of $$G$$ that has high conductance. We use this algorithm to obtain deterministic algorithms for dynamic connectivity and minimum spanning forest, whose worst-case update time on an $$n$$-vertex graph is $$n^{o(1)}$$, thus resolving a major open problem in the area of dynamic graph algorithms. Our work also implies deterministic algorithms for a host of additional problems, whose time complexities match, up to subpolynomial in $$n$$ factors, those of known randomized algorithms. The implications include almost-linear time deterministic algorithms for solving Laplacian systems and for approximating maximum flows in undirected graphs.
Inamdar, Tanmay; Lokshtanov, Daniel; Saurabh, Saket; Surianarayanan, Vaishali
(, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J; Herman, Grzegorz
(Ed.)
In the Minimum Bisection problem input is a graph G and the goal is to partition the vertex set into two parts A and B, such that ||A|-|B|| ≤ 1 and the number k of edges between A and B is minimized. The problem is known to be NP-hard, and assuming the Unique Games Conjecture even NP-hard to approximate within a constant factor [Khot and Vishnoi, J.ACM'15]. On the other hand, a 𝒪(log n)-approximation algorithm [Räcke, STOC'08] and a parameterized algorithm [Cygan et al., ACM Transactions on Algorithms'20] running in time k^𝒪(k) n^𝒪(1) is known. The Minimum Bisection problem can be viewed as a clustering problem where edges represent similarity and the task is to partition the vertices into two equally sized clusters while minimizing the number of pairs of similar objects that end up in different clusters. Motivated by a number of egregious examples of unfair bias in AI systems, many fundamental clustering problems have been revisited and re-formulated to incorporate fairness constraints. In this paper we initiate the study of the Minimum Bisection problem with fairness constraints. Here the input is a graph G, positive integers c and k, a function χ:V(G) → {1, …, c} that assigns a color χ(v) to each vertex v in G, and c integers r_1,r_2,⋯,r_c. The goal is to partition the vertex set of G into two almost-equal sized parts A and B with at most k edges between them, such that for each color i ∈ {1, …, c}, A has exactly r_i vertices of color i. Each color class corresponds to a group which we require the partition (A, B) to treat fairly, and the constraints that A has exactly r_i vertices of color i can be used to encode that no group is over- or under-represented in either of the two clusters. We first show that introducing fairness constraints appears to make the Minimum Bisection problem qualitatively harder. Specifically we show that unless FPT=W[1] the problem admits no f(c)n^𝒪(1) time algorithm even when k = 0. On the other hand, our main technical contribution shows that is that this hardness result is simply a consequence of the very strict requirement that each color class i has exactly r_i vertices in A. In particular we give an f(k,c,ε)n^𝒪(1) time algorithm that finds a balanced partition (A, B) with at most k edges between them, such that for each color i ∈ [c], there are at most (1±ε)r_i vertices of color i in A. Our approximation algorithm is best viewed as a proof of concept that the technique introduced by [Lampis, ICALP'18] for obtaining FPT-approximation algorithms for problems of bounded tree-width or clique-width can be efficiently exploited even on graphs of unbounded width. The key insight is that the technique of Lampis is applicable on tree decompositions with unbreakable bags (as introduced in [Cygan et al., SIAM Journal on Computing'14]). An important ingredient of our approximation scheme is a combinatorial result that may be of independent interest, namely that for every k, every graph G admits a tree decomposition with adhesions of size at most 𝒪(k), unbreakable bags, and logarithmic depth.
Carlson, Charlie; Jafarov, Jafar; Makarychev, Konstantin; Makarychev, Yury; Shan, Liren
(, Proceedings of the European Symposium on Algorithms)
Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J; Herman, Grzegorz
(Ed.)
We consider variants of the classic Multiway Cut problem. Multiway Cut asks to partition a graph G into k parts so as to separate k given terminals. Recently, Chandrasekaran and Wang (ESA 2021) introduced l_p-norm Multiway Cut, a generalization of the problem, in which the goal is to minimize the l_p norm of the edge boundaries of k parts. We provide an O(log^{1/2} n log^{1/2 + 1/p} k) approximation algorithm for this problem, improving upon the approximation guarantee of O(log^{3/2} n log^{1/2} k) due to Chandrasekaran and Wang. We also introduce and study Norm Multiway Cut, a further generalization of Multiway Cut. We assume that we are given access to an oracle, which answers certain queries about the norm. We present an O(log^{1/2} n log^{7/2} k) approximation algorithm with a weaker oracle and an O(log^{1/2} n log^{5/2} k) approximation algorithm with a stronger oracle. Additionally, we show that without any oracle access, there is no n^{1/4-ε} approximation algorithm for every ε > 0 assuming the Hypergraph Dense-vs-Random Conjecture.
Ding, Matthew; Garces, Alexandro; Li, Jason; Lin, Honghao; Nelson, Jelani; Shah, Vihan; Woodruff, David P
(, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Meka, Raghu
(Ed.)
We consider the problem of finding a minimum cut of a weighted graph presented as a single-pass stream. While graph sparsification in streams has been intensively studied, the specific application of finding minimum cuts in streams is less well-studied. To this end, we show upper and lower bounds on minimum cut problems in insertion-only streams for a variety of settings, including for both randomized and deterministic algorithms, for both arbitrary and random order streams, and for both approximate and exact algorithms. One of our main results is an Õ(n/ε) space algorithm with fast update time for approximating a spectral cut query with high probability on a stream given in an arbitrary order. Our result breaks the Ω(n/ε²) space lower bound required of a sparsifier that approximates all cuts simultaneously. Using this result, we provide streaming algorithms with near optimal space of Õ(n/ε) for minimum cut and approximate all-pairs effective resistances, with matching space lower-bounds. The amortized update time of our algorithms is Õ(1), provided that the number of edges in the input graph is at least (n/ε²)^{1+o(1)}. We also give a generic way of incorporating sketching into a recursive contraction algorithm to improve the post-processing time of our algorithms. In addition to these results, we give a random-order streaming algorithm that computes the exact minimum cut on a simple, unweighted graph using Õ(n) space. Finally, we give an Ω(n/ε²) space lower bound for deterministic minimum cut algorithms which matches the best-known upper bound up to polylogarithmic factors.
Ashvinkumar, Vikrant, Assadi, Sepehr, Deng, Chengyuan, Gao, Jie, and Wang, Chen.
"Evaluating Stability in Massive Social Networks: Efficient Streaming Algorithms for Structural Balance". APPROX-RANDOM (). Country unknown/Code not available: Schloss Dagstuhl – Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPICS.APPROX/RANDOM.2023.58.https://par.nsf.gov/biblio/10488810.
@article{osti_10488810,
place = {Country unknown/Code not available},
title = {Evaluating Stability in Massive Social Networks: Efficient Streaming Algorithms for Structural Balance},
url = {https://par.nsf.gov/biblio/10488810},
DOI = {10.4230/LIPICS.APPROX/RANDOM.2023.58},
abstractNote = {Structural balance theory studies stability in networks. Given a n-vertex complete graph G = (V,E) whose edges are labeled positive or negative, the graph is considered balanced if every triangle either consists of three positive edges (three mutual "friends"), or one positive edge and two negative edges (two "friends" with a common "enemy"). From a computational perspective, structural balance turns out to be a special case of correlation clustering with the number of clusters at most two. The two main algorithmic problems of interest are: (i) detecting whether a given graph is balanced, or (ii) finding a partition that approximates the frustration index, i.e., the minimum number of edge flips that turn the graph balanced. We study these problems in the streaming model where edges are given one by one and focus on memory efficiency. We provide randomized single-pass algorithms for: (i) determining whether an input graph is balanced with O(log n) memory, and (ii) finding a partition that induces a (1 + ε)-approximation to the frustration index with O(n ⋅ polylog(n)) memory. We further provide several new lower bounds, complementing different aspects of our algorithms such as the need for randomization or approximation. To obtain our main results, we develop a method using pseudorandom generators (PRGs) to sample edges between independently-chosen vertices in graph streaming. Furthermore, our algorithm that approximates the frustration index improves the running time of the state-of-the-art correlation clustering with two clusters (Giotis-Guruswami algorithm [SODA 2006]) from n^O(1/ε²) to O(n²log³n/ε² + n log n ⋅ (1/ε)^O(1/ε⁴)) time for (1+ε)-approximation. These results may be of independent interest.},
journal = {APPROX-RANDOM},
publisher = {Schloss Dagstuhl – Leibniz-Zentrum für Informatik},
author = {Ashvinkumar, Vikrant and Assadi, Sepehr and Deng, Chengyuan and Gao, Jie and Wang, Chen},
editor = {Megow, Nicole and Smith, Adam}
}
Warning: Leaving National Science Foundation Website
You are now leaving the National Science Foundation website to go to a non-government website.
Website:
NSF takes no responsibility for and exercises no control over the views expressed or the accuracy of
the information contained on this site. Also be aware that NSF's privacy policy does not apply to this site.