Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

null (Ed.)Motivated by the increasing need to understand the distributed algorithmic foundations of largescale graph computations, we study some fundamental graph problems in a messagepassing model for distributed computing where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n >> k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many realworld systems. Communication is pointtopoint, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem , a theorem that can be used to show nontrivial lower bounds on the round complexity of distributed largescale data computations. This result is established via an informationtheoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic, and this theorem can be used in a “cookbook” fashion to show distributed lower bounds for several problems, including nongraph problems. We present two applications by showing (almost) tight lower bounds on the round complexity of two fundamental graph problems, namely, PageRank computation and triangle enumeration . These applications show that our approach can yield lower bounds for problems where the application of communication complexity techniques seems not obvious or gives weak bounds, including and especially under a stochastic partition of the input. We then present distributed algorithms for PageRank and triangle enumeration with a round complexity that (almost) matches the respective lower bounds; these algorithms exhibit a round complexity that scales superlinearly in k , improving significantly over previous results [Klauck et al., SODA 2015]. Specifically, we show the following results: PageRank: We show a lower bound of Ὼ(n/k 2 ) rounds and present a distributed algorithm that computes an approximation of the PageRank of all the nodes of a graph in Õ(n/k 2 ) rounds. Triangle enumeration: We show that there exist graphs with m edges where any distributed algorithm requires Ὼ(m/k 5/3 ) rounds. This result also implies the first nontrivial lower bound of Ὼ(n 1/3 ) rounds for the congested clique model, which is tight up to logarithmic factors. We then present a distributed algorithm that enumerates all the triangles of a graph in Õ(m/k 5/3 + n/k 4/3 ) rounds.more » « less

In the parallel paging problem, there are p processors that share a cache of size k. The goal is to partition the cache among the processors over time in order to minimize their average completion time. For this longstanding open problem, we give tight upper and lower bounds of ⇥(log p) on the competitive ratio with O(1) resource augmentation. A key idea in both our algorithms and lower bounds is to relate the problem of parallel paging to the seemingly unrelated problem of green paging. In green paging, there is an energyoptimized processor that can temporarily turn off one or more of its cache banks (thereby reducing power consumption), so that the cache size varies between a maximum size k and a minimum size k/p. The goal is to minimize the total energy consumed by the computation, which is proportional to the integral of the cache size over time. We show that any efficient solution to green paging can be converted into an efficient solution to parallel paging, and that any lower bound for green paging can be converted into a lower bound for parallel paging, in both cases in a blackbox fashion. We then show that, with O(1) resource augmentation, the optimal competitive ratio for deterministic online green paging is ⇥(log p), which, in turn, implies the same bounds for deterministic online parallel paging.more » « less

null (Ed.)In the parallel paging problem, there are $\pP$ processors that share a cache of size $k$. The goal is to partition the cache among the \procs over time in order to minimize their average completion time. For this longstanding open problem, we give tight upper and lower bounds of $\Theta(\log \p)$ on the competitive ratio with $O(1)$ resource augmentation. A key idea in both our algorithms and lower bounds is to relate the problem of parallel paging to the seemingly unrelated problem of green paging. In green paging, there is an energyoptimized processor that can temporarily turn off one or more of its cache banks (thereby reducing power consumption), so that the cache size varies between a maximum size $k$ and a minimum size $k/\p$. The goal is to minimize the total energy consumed by the computation, which is proportional to the integral of the cache size over time. We show that any efficient solution to green paging can be converted into an efficient solution to parallel paging, and that any lower bound for green paging can be converted into a lower bound for parallel paging, in both cases in a blackbox fashion. We then show that, with $O(1)$ resource augmentation, the optimal competitive ratio for deterministic online green paging is $\Theta(\log \p)$, which, in turn, implies the same bounds for deterministic online parallel paging.more » « less