We study several fundamental problems in the kmachine model, a messagepassing model for largescale distributed computations where k ≥ 2 machines jointly perform computations on a large input of size N, (typically, N ≫ k). The input is initially partitioned (randomly or in a balanced fashion) among the k machines, a common implementation in many realworld systems. Communication is pointtopoint, and the goal is to minimize the number of communication rounds of the computation. Our main result is a general technique for designing efficient deterministic distributed algorithms in the kmachine model using PRAM algorithms. Our technique works by efficiently simulating PRAM algorithms in the kmachine model in a deterministic way. This simulation allows us to arrive at new algorithms in the kmachine model for some problems for which no efficient kmachine algorithms are known before and also improve on existing results in the kmachine model for some problems. While our simulation allows us to obtain kmachine algorithms for any problem with a known PRAM algorithm, we mainly focus on graph problems. For an input graph on n vertices and m edges, we obtain Õ(m/k 2 ) round 4 algorithms for various graph problems such as rconnectivity for r = 1,more »
On the Distributed Complexity of LargeScale Graph Computations
Motivated by the increasing need to understand the distributed algorithmic foundations of largescale graph computations, we study some fundamental graph problems in a messagepassing model for distributed computing where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n >> k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many realworld systems. Communication is pointtopoint, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem , a theorem that can be used to show nontrivial lower bounds on the round complexity of distributed largescale data computations. This result is established via an informationtheoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic, and this theorem can be used in a “cookbook” fashion to show distributed lower bounds for several problems, including nongraph problems. We present two applications by showing (almost) tight lower bounds on the round complexity of two fundamental graph problems, namely, PageRank computation and triangle enumeration . These applications show that our approach can yield lower bounds more »
 Award ID(s):
 1633720
 Publication Date:
 NSFPAR ID:
 10297340
 Journal Name:
 ACM Transactions on Parallel Computing
 Volume:
 8
 Issue:
 2
 Page Range or eLocationID:
 1 to 28
 ISSN:
 23294949
 Sponsoring Org:
 National Science Foundation
More Like this


We introduce an abstract and strong model of massively parallel computation, where essentially the only restrictions are that the “fanin” of each machine is limited to s bits, where s is smaller than the input size n, and that computation proceeds in synchronized rounds, with no communication between different machines within a round. Lower bounds on round complexity in this model apply to every computing platform that shares the most basic design principles of MapReducetype systems. We apply a variant of the “polynomial method” to capture restrictions obeyed by all such massively parallel computations. This connection allows us to translate a lower bound on the (approximate) polynomial degree of a Boolean function to a lower bound on the round complexity of every (randomized) massively parallel computation of that function. These lower bounds apply even in the “unbounded width” version of our model, where the number of machines can be arbitrarily large. As one example of our general results, computing any nontrivial monotone graph property — such as any of the standard connectivity problems — requires a superconstant number of rounds when every machine can accept only a subpolynomial (in n) number of input bits s. This lower bound constitutes significantmore »

We study the communication cost (or message complexity) of fundamental distributed symmetry breaking problems, namely, coloring and MIS. While significant progress has been made in understanding and improving the running time of such problems, much less is known about the message complexity of these problems. In fact, all known algorithms need at least Ω(m) communication for these problems, where m is the number of edges in the graph. We addressthe following question in this paper: can we solve problems such as coloring and MIS using sublinear, i.e., o(m) communication, and if sounder what conditions? In a classical result, Awerbuch, Goldreich, Peleg, and Vainish [JACM 1990] showed that fundamental global problems such asbroadcast and spanning tree construction require at least o(m) messages in the KT1 Congest model (i.e., Congest model in which nodes have initial knowledge of the neighbors' ID's) when algorithms are restricted to be comparisonbased (i.e., algorithms inwhich node ID's can only be compared). Thirty five years after this result, King, Kutten, and Thorup [PODC 2015] showed that onecan solve the above problems using Õ(n) messages (n is the number of nodes in the graph) in Õ(n) rounds in the KT1 Congest model if noncomparisonbased algorithms are permitted. Anmore »

The Sparsest Cut is a fundamental optimization problem that have been extensively studied. For planar inputs the problem is in P and can be solved in Õ(n 3 ) time if all vertex weights are 1. Despite a significant amount of effort, the best algorithms date back to the early 90’s and can only achieve O(log n)approximation in Õ(n) time or 3.5approximation in Õ(n 2 ) time [Rao, STOC92]. Our main result is an Ω(n 2−ε ) lower bound for Sparsest Cut even in planar graphs with unit vertex weights, under the (min, +)Convolution conjecture, showing that approxima tions are inevitable in the nearlinear time regime. To complement the lower bound, we provide a 3.3approximation in nearlinear time, improving upon the 25year old result of Rao in both time and accuracy. We also show that our lower bound is not far from optimal by observing an exact algorithm with running time Õ(n 5/2 ) improving upon the Õ(n 3 ) algorithm of Park and Phillips [STOC93]. Our lower bound accomplishes a repeatedly raised challenge by being the first finegrained lower bound for a natural planar graph problem in P. Building on our construction we prove nearquadratic lower bounds under SETHmore »

We study smoothed analysis of distributed graph algorithms, focusing on the fundamental minimum spanning tree (MST) problem. With the goal of studying the time complexity of distributed MST as a function of the "perturbation" of the input graph, we posit a smoothing model that is parameterized by a smoothing parameter 0 ≤ ϵ(n) ≤ 1 which controls the amount of random edges that can be added to an input graph G per round. Informally, ϵ(n) is the probability (typically a small function of n, e.g., n¼) that a random edge can be added to a node per round. The added random edges, once they are added, can be used (only) for communication. We show upper and lower bounds on the time complexity of distributed MST in the above smoothing model. We present a distributed algorithm that, with high probability, 1 computes an MST and runs in Õ(min{1/√ϵ(n)2O(√log n), D+ √n}) rounds2 where ϵ is the smoothing parameter, D is the network diameter and n is the network size. To complement our upper bound, we also show a lower bound of Ω(min{1/√ϵ(n), D + √n}). We note that the upper and lower bounds essentially match except for a multiplicative 2O(√log n)more »