skip to main content


Title: Distributed Graph Diameter Approximation
We present an algorithm for approximating the diameter of massive weighted undirected graphs on distributed platforms supporting a MapReduce-like abstraction. In order to be efficient in terms of both time and space, our algorithm is based on a decomposition strategy which partitions the graph into disjoint clusters of bounded radius. Theoretically, our algorithm uses linear space and yields a polylogarithmic approximation guarantee; most importantly, for a large family of graphs, it features a round complexity asymptotically smaller than the one exhibited by a natural approximation algorithm based on the state-of-the-art Δ-stepping SSSP algorithm, which is its only practical, linear-space competitor in the distributed setting. We complement our theoretical findings with a proof-of-concept experimental analysis on large benchmark graphs, which suggests that our algorithm may attain substantial improvements in terms of running time compared to the aforementioned competitor, while featuring, in practice, a similar approximation ratio.  more » « less
Award ID(s):
1813444
PAR ID:
10273680
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Algorithms
Volume:
13
Issue:
9
ISSN:
1999-4893
Page Range / eLocation ID:
216
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the maximum matching problem in the semi-streaming model formalized by Feigenbaum, Kannan, McGregor, Suri, and Zhang that is inspired by giant graphs of today. As our main result, we give a two-pass (1/2 + 1/16)-approximation algorithm for triangle-free graphs and a two-pass (1/2 + 1/32)-approximation algorithm for general graphs; these improve the approximation ratios of 1/2 + 1/52 for bipartite graphs and 1/2 + 1/140 for general graphs by Konrad, Magniez, and Mathieu. In three passes, we are able to achieve approximation ratios of 1/2 + 1/10 for triangle-free graphs and 1/2 + 1/19.753 for general graphs. We also give a multi-pass algorithm where we bound the number of passes precisely—we give a (2/3 − ε)- approximation algorithm that uses 2/(3ε) passes for triangle-free graphs and 4/(3ε) passes for general graphs. Our algorithms are simple and combinatorial, use O(n log n) space, and (can be implemented to) have O(1) update time per edge. For general graphs, our multi-pass algorithm improves the best known deterministic algorithms in terms of the number of passes: * Ahn and Guha give a (2/3−ε)-approximation algorithm that uses O(log(1/ε)/ε2) passes, whereas our (2/3 − ε)-approximation algorithm uses 4/(3ε) passes; * they also give a (1 − ε)-approximation algorithm that uses O(log n · poly(1/ε)) passes, where n is the number of vertices of the input graph; although our algorithm is (2/3−ε)-approximation, our number of passes do not depend on n. Earlier multi-pass algorithms either have a large constant inside big-O notation for the number of passes or the constant cannot be determined due to the involved analysis, so our multi-pass algorithm should use much fewer passes for approximation ratios bounded slightly below 2/3. 
    more » « less
  2. Over the past decade, there has been increasing interest in distributed/parallel algorithms for processing large-scale graphs. By now, we have quite fast algorithms---usually sublogarithmic-time and often poly(łogłog n)-time, or even faster---for a number of fundamental graph problems in the massively parallel computation (MPC) model. This model is a widely-adopted theoretical abstraction of MapReduce style settings, where a number of machines communicate in an all-to-all manner to process large-scale data. Contributing to this line of work on MPC graph algorithms, we present poly(łog k) ε poly(łogłog n) round MPC algorithms for computing O(k^1+o(1) )-spanners in the strongly sublinear regime of local memory. To the best of our knowledge, these are the first sublogarithmic-time MPC algorithms for spanner construction. As primary applications of our spanners, we get two important implications, as follows: -For the MPC setting, we get an O(łog^2łog n)-round algorithm for O(łog^1+o(1) n) approximation of all pairs shortest paths (APSP) in the near-linear regime of local memory. To the best of our knowledge, this is the first sublogarithmic-time MPC algorithm for distance approximations. -Our result above also extends to the Congested Clique model of distributed computing, with the same round complexity and approximation guarantee. This gives the first sub-logarithmic algorithm for approximating APSP in weighted graphs in the Congested Clique model. 
    more » « less
  3. Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de-facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)-approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting k-cliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space. In addition to our theoretical results, we simulate our triangle counting algorithms in real-world graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems. 
    more » « less
  4. We devise multigrid preconditioners for linear-quadratic space-time distributed parabolic optimal control problems. While our method is rooted in earlier work on elliptic control, the temporal dimension presents new challenges in terms of algorithm design and quality. Our primary focus is on the cG(s)dG(r) discretizations which are based on functions that are continuous in space and discontinuous in time, but our technique is applicable to various other space-time finite element discretizations. We construct and analyse two kinds of multigrid preconditioners: the first is based on full coarsening in space and time, while the second is based on semi-coarsening in space only. Our analysis, in conjunction with numerical experiments, shows that both preconditioners are of optimal order with respect to the discretization in case of cG(1)dG(r) for r = 0, 1 and exhibit a suboptimal behaviour in time for Crank–Nicolson. We also show that, under certain conditions, the preconditioner using full space-time coarsening is more efficient than the one involving semi-coarsening in space, a phenomenon that has not been observed previously. Our numerical results confirm the theoretical findings. 
    more » « less
  5. The Sylvester equation offers a powerful and unifying primitive for a variety of important graph mining tasks, including network alignment, graph kernel, node similarity, subgraph matching, etc. A major bottleneck of Sylvester equation lies in its high computational complexity. Despite tremendous effort, state-of-the-art methods still require a complexity that is at least \em quadratic in the number of nodes of graphs, even with approximations. In this paper, we propose a family of Krylov subspace based algorithms (\fasten) to speed up and scale up the computation of Sylvester equation for graph mining. The key idea of the proposed methods is to project the original equivalent linear system onto a Kronecker Krylov subspace. We further exploit (1) the implicit representation of the solution matrix as well as the associated computation, and (2) the decomposition of the original Sylvester equation into a set of inter-correlated Sylvester equations of smaller size. The proposed algorithms bear two distinctive features. First, they provide the \em exact solutions without any approximation error. Second, they significantly reduce the time and space complexity for solving Sylvester equation, with two of the proposed algorithms having a \em linear complexity in both time and space. Experimental evaluations on a diverse set of real networks, demonstrate that our methods (1) are up to $10,000\times$ faster against Conjugate Gradient method, the best known competitor that outputs the exact solution, and (2) scale up to million-node graphs. 
    more » « less