skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Streaming Verification for Graph Problems: Optimal Tradeoffs and Nonlinear Sketches
We study graph computations in an enhanced data streaming setting, where a space-bounded client reading the edge stream of a massive graph may delegate some of its work to a cloud service. We seek algorithms that allow the client to verify a purported proof sent by the cloud service that the work done in the cloud is correct. A line of work starting with Chakrabarti et al. (ICALP 2009) has provided such algorithms, which we call schemes, for several statistical and graph-theoretic problems, many of which exhibit a tradeoff between the length of the proof and the space used by the streaming verifier. This work designs new schemes for a number of basic graph problems - including triangle counting, maximum matching, topological sorting, and single-source shortest paths - where past work had either failed to obtain smooth tradeoffs between these two key complexity measures or only obtained suboptimal tradeoffs. Our key innovation is having the verifier compute certain nonlinear sketches of the input stream, leading to either new or improved tradeoffs. In many cases, our schemes, in fact, provide optimal tradeoffs up to logarithmic factors. Specifically, for most graph problems that we study, it is known that the product of the verifier’s space cost v and the proof length h must be at least Omega(n^2) for n-vertex graphs. However, matching upper bounds are only known for a handful of settings of h and v on the curve h*v = ~Theta(n^2). For example, for counting triangles and maximum matching, schemes with costs lying on this curve are only known for (h = ~O(n²), v = ~O(1)), (h = ~O(n), v = ~O(n)), and the trivial (h = ~O(1), v = ~O(n²)). A major message of this work is that by exploiting nonlinear sketches, a significant "portion" of costs on the tradeoff curve h*v=n^2 can be achieved.  more » « less
Award ID(s):
1918989 1845125 1907738
PAR ID:
10193031
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Leibniz international proceedings in informatics
Volume:
176
ISSN:
1868-8969
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Meka, Raghu (Ed.)
    We consider the problem of finding a minimum cut of a weighted graph presented as a single-pass stream. While graph sparsification in streams has been intensively studied, the specific application of finding minimum cuts in streams is less well-studied. To this end, we show upper and lower bounds on minimum cut problems in insertion-only streams for a variety of settings, including for both randomized and deterministic algorithms, for both arbitrary and random order streams, and for both approximate and exact algorithms. One of our main results is an Õ(n/ε) space algorithm with fast update time for approximating a spectral cut query with high probability on a stream given in an arbitrary order. Our result breaks the Ω(n/ε²) space lower bound required of a sparsifier that approximates all cuts simultaneously. Using this result, we provide streaming algorithms with near optimal space of Õ(n/ε) for minimum cut and approximate all-pairs effective resistances, with matching space lower-bounds. The amortized update time of our algorithms is Õ(1), provided that the number of edges in the input graph is at least (n/ε²)^{1+o(1)}. We also give a generic way of incorporating sketching into a recursive contraction algorithm to improve the post-processing time of our algorithms. In addition to these results, we give a random-order streaming algorithm that computes the exact minimum cut on a simple, unweighted graph using Õ(n) space. Finally, we give an Ω(n/ε²) space lower bound for deterministic minimum cut algorithms which matches the best-known upper bound up to polylogarithmic factors. 
    more » « less
  2. We revisit the much-studied problem of space-efficiently estimating the number of triangles in a graph stream, and extensions of this problem to counting fixed-sized cliques and cycles, obtaining a number of new upper and lower bounds. For the important special case of counting triangles, we give a $$4$$-pass, $$(1\pm\varepsilon)$$-approximate, randomized algorithm that needs at most $$\widetilde{O}(\varepsilon^{-2}\cdot m^{3/2}/T)$$ space, where $$m$$ is the number of edges and $$T$$ is a promised lower bound on the number of triangles. This matches the space bound of a very recent algorithm (McGregor et al., PODS 2016), with an arguably simpler and more general technique. We give an improved multi-pass lower bound of $$\Omega(\min\{m^{3/2}/T, m/\sqrt{T}\})$$, applicable at essentially all densities $$\Omega(n) \le m \le O(n^2)$$. We also prove other multi-pass lower bounds in terms of various structural parameters of the input graph. Together, our results resolve a couple of open questions raised in recent work (Braverman et al., ICALP 2013). Our presentation emphasizes more general frameworks, for both upper and lower bounds. We give a sampling algorithm for counting arbitrary subgraphs and then improve it via combinatorial means in the special cases of counting odd cliques and odd cycles. Our results show that these problems are considerably easier in the cash-register streaming model than in the turnstile model, where previous work had focused (Manjunath et al., ESA 2011; Kane et al., ICALP 2012). We use Tur{\'a}n graphs and related gadgets to derive lower bounds for counting cliques and cycles, with triangle-counting lower bounds following as a corollary. 
    more » « less
  3. We consider the problem of preprocessing a weighted directed planar graph in order to quickly answer exact distance queries. The main tension in this problem is between space S and query time Q , and since the mid-1990s all results had polynomial time-space tradeoffs, e.g., Q = ~ Θ( n/√ S ) or Q = ~Θ( n 5/2 /S 3/2 ). In this article we show that there is no polynomial tradeoff between time and space and that it is possible to simultaneously achieve almost optimal space n 1+ o (1) and almost optimal query time n o (1) . More precisely, we achieve the following space-time tradeoffs: n 1+ o (1) space and log 2+ o (1) n query time, n log 2+ o (1) n space and n o (1) query time, n 4/3+ o (1) space and log 1+ o (1) n query time. We reduce a distance query to a variety of point location problems in additively weighted Voronoi diagrams and develop new algorithms for the point location problem itself using several partially persistent dynamic tree data structures. 
    more » « less
  4. Graph coloring is a fundamental problem with wide reaching applications in various areas including ata mining and databases, e.g., in parallel query optimization. In recent years, there has been a growing interest in solving various graph coloring problems in the streaming model. The initial algorithms in this line of work are all crucially randomized, raising natural questions about how important a role randomization plays in streaming graph coloring. A couple of very recent works prove that deterministic or even adversarially robust coloring algorithms (that work on streams whose updates may depend on the algorithm's past outputs) are considerably weaker than standard randomized ones. However, there is still a significant gap between the upper and lower bounds for the number of colors needed (as a function of the maximum degree Δ) for robust coloring and multipass deterministic coloring. We contribute to this line of work by proving the following results. In the deterministic semi-streaming (i.e., O(n · polylog n) space) regime, we present an algorithm that achieves a combinatorially optimal (Δ+1)-coloring using O(logΔ log logΔ) passes. This improves upon the prior O(Δ)-coloring algorithm of Assadi, Chen, and Sun (STOC 2022) at the cost of only an O(log logΔ) factor in the number of passes. In the adversarially robust semi-streaming regime, we design an O(Δ5/2)-coloring algorithm that improves upon the previously best O(Δ3)-coloring algorithm of Chakrabarti, Ghosh, and Stoeckl (ITCS 2022). Further, we obtain a smooth colors/space tradeoff that improves upon another algorithm of the said work: whereas their algorithm uses O(Δ2) colors and O(nΔ1/2) space, ours, in particular, achieves (i)~O(Δ2) colors in O(nΔ1/3) space, and (ii)~O(Δ7/4) colors in O(nΔ1/2) space. 
    more » « less
  5. We give an $$\widetilde{O}(\sqrt{n})$$-space single-pass 0.483-approximation streaming algorithm for estimating the maximum directed cut size (Max-DICUT) in a directed graph on n vertices. This improves over an $$O(\log n)$$-space $4 / 9 < 0.45$ approximation algorithm due to Chou, Golovnev, and Velusamy (FOCS 2020), which was known to be optimal for $$o(\sqrt{n})$$-space algorithms. Max-DICUT is a special case of a constraint satisfaction problem (CSP). In this broader context, we give the first CSP for which algorithms with $$\widetilde{O}(\sqrt{n})$$- space can provably outperform $$o(\sqrt{n})$$- space algorithms. The key technical contribution of our work is development of the notions of a first-order snapshot of a (directed) graph and of estimates of such snapshots. These snapshots can be used to simulate certain (non-streaming) Max-DICUT algorithms, including the “oblivious” algorithms introduced by Feige and Jozeph (Algorithmica, 2015), who showed that one such algorithm Previous work of the authors (SODA 2023) studied the restricted case of bounded-degree graphs, and observed that in this setting, it is straightforward to estimate the snapshot with $$\ell_{1}$$ errors and this suffices to simulate oblivious algorithms. But for unbounded-degree graphs, even defining an achievable and sufficient notion of estimation is subtle. We describe a new notion of snapshot estimation and prove its sufficiency using careful smoothing techniques, and then develop an algorithm which sketches such an estimate via a delicate process of intertwined vertex- and edge-subsampling. Prior to our work, the only streaming algorithms for any CSP on general instances were based on generalizations of the $$O(\log n)$$-space algorithm for Max-DICUT, and can roughly be characterized as based on “zeroth” order snapshots. Our work thus opens the possibility of a new class of algorithms for approximating CSPs by demonstrating that more sophisticated snapshots can outperform cruder ones in the case of Max-DICUT. 
    more » « less