skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: I/O-Efficient Algorithms for Topological Sort and Related Problems
This article presents I/O-efficient algorithms for topologically sorting a directed acyclic graph and for the more general problem identifying and topologically sorting the strongly connected components of a directed graph G = ( V, E ). Both algorithms are randomized and have I/O-costs O ( sort ( E ) · poly(log V)), with high probability, where sort ( E ) = O( E / B log M / B ( E/B )) is the I/O cost of sorting an | E |-element array on a machine with size- B blocks and size- M cache/internal memory. These are the first algorithms for these problems that do not incur at least one I/O per vertex, and as such these are the first I/O-efficient algorithms for sparse graphs. By applying the technique of time-forward processing, these algorithms also imply I/O-efficient algorithms for most problems on directed acyclic graphs, such as shortest paths, as well as the single-source reachability problem on arbitrary directed graphs.  more » « less
Award ID(s):
2106759 1918989
PAR ID:
10407974
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ACM Transactions on Algorithms
Volume:
18
Issue:
1
ISSN:
1549-6325
Page Range / eLocation ID:
1 to 24
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Berenbrink, Petra and (Ed.)
    A directed acyclic graph G = (V,E) is said to be (e,d)-depth robust if for every subset S ⊆ V of |S| ≤ e nodes the graph G-S still contains a directed path of length d. If the graph is (e,d)-depth-robust for any e,d such that e+d ≤ (1-ε)|V| then the graph is said to be ε-extreme depth-robust. In the field of cryptography, (extremely) depth-robust graphs with low indegree have found numerous applications including the design of side-channel resistant Memory-Hard Functions, Proofs of Space and Replication and in the design of Computationally Relaxed Locally Correctable Codes. In these applications, it is desirable to ensure the graphs are locally navigable, i.e., there is an efficient algorithm GetParents running in time polylog|V| which takes as input a node v ∈ V and returns the set of v’s parents. We give the first explicit construction of locally navigable ε-extreme depth-robust graphs with indegree O(log |V|). Previous constructions of ε-extreme depth-robust graphs either had indegree ω̃(log² |V|) or were not explicit. 
    more » « less
  2. Given a directed acyclic graph (DAG) G=(V,E), we say that G is (e,d)-depth-robust (resp. (e,d)-edge-depth-robust) if for any set S⊆V (resp. S⊆E) of at most |S|≤e nodes (resp. edges) the graph G−S contains a directed path of length d. While edge-depth-robust graphs are potentially easier to construct, many applications in cryptography require node depth-robust graphs with small indegree. We create a graph reduction that transforms an (e,d)-edge-depth-robust graph with m edges into a (e/2,d)-depth-robust graph with O(m) nodes and constant indegree. One immediate consequence of this result is the first construction of a provably (nloglognlogn,nlogn(logn)loglogn)-depth-robust graph with constant indegree. Our reduction crucially relies on ST-robust graphs, a new graph property we introduce which may be of independent interest. We say that a directed, acyclic graph with n inputs and n outputs is (k1,k2)-ST-robust if we can remove any k1 nodes and there exists a subgraph containing at least k2 inputs and k2 outputs such that each of the k2 inputs is connected to all of the k2 outputs. If the graph if (k1,n−k1)-ST-robust for all k1≤n we say that the graph is maximally ST-robust. We show how to construct maximally ST-robust graphs with constant indegree and O(n) nodes. Given a family M of ST-robust graphs and an arbitrary (e,d)-edge-depth-robust graph G we construct a new constant-indegree graph Reduce(G,M) by replacing each node in G with an ST-robust graph from M. We also show that ST-robust graphs can be used to construct (tight) proofs-of-space and (asymptotically) improved wide-block labeling functions. 
    more » « less
  3. Parallel radix sorting has been extensively studied in the literature for decades. However, the most efficient implementations require auxiliary memory proportional to the input size, which can be prohibitive for large inputs. The standard serial in-place radix sorting algorithm is based on swapping each key to its correct place in the array on each level of recursion. However, it is not straightforward to parallelize this algorithm due to dependencies among the swaps. This paper presents Regions Sort, a new parallel in-place radix sorting algorithm that is efficient both in theory and in practice. Our algorithm uses a graph structure to model dependencies across elements that need to be swapped, and generates independent tasks from this graph that can be executed in parallel. For sorting $$n$$ integers from a range $$r$$, and a parameter $$K$$, Regions Sort requires only $$O(K\log r\log n)$$ auxiliary memory. Our algorithm requires $$O(n\log r)$$ work and $$O((n/K+\log K)\log r)$$ span, making it work-efficient and highly parallel. In addition, we present several optimizations that significantly improve the empirical performance of our algorithm. We compare the performance of Regions Sort to existing parallel in-place and out-of-place sorting algorithms on a variety of input distributions and show that Regions Sort is faster than optimized out-of-place radix sorting and comparison sorting algorithms, and is almost always faster than the fastest publicly-available in-place sorting algorithm. 
    more » « less
  4. Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de-facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)-approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting k-cliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space. In addition to our theoretical results, we simulate our triangle counting algorithms in real-world graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems. 
    more » « less
  5. We give an algorithm that computes exact maximum flows and minimum-cost flows on directed graphs with m edges and polynomially bounded integral demands, costs, and capacities in m^{1+o(1)} time. Our algorithm builds the flow through a sequence of m^{1+o(1)} approximate undirected minimum-ratio cycles, each of which is computed and processed in amortized m^{o(1)} time using a new dynamic graph data structure. Our framework extends to algorithms running in m^{1+o(1)} time for computing flows that minimize general edge-separable convex functions to high accuracy. This gives almost-linear time algorithms for several problems including entropy-regularized optimal transport, matrix scaling, p-norm flows, and p-norm isotonic regression on arbitrary directed acyclic graphs. 
    more » « less