
Title: Near-Optimal Hypergraph Sparsification in Insertion-Only and Bounded-Deletion Streams
We study the problem of constructing hypergraph cut sparsifiers in the streaming model, where a hypergraph on n vertices is revealed either via an arbitrary sequence of hyperedge insertions alone (insertion-only streaming model) or via an arbitrary sequence of hyperedge insertions and deletions (dynamic streaming model). For any ε ∈ (0,1), a (1 ± ε) hypergraph cut-sparsifier of a hypergraph H is a reweighted subgraph H' whose cut values approximate those of H to within a (1 ± ε) factor. Prior work shows that in the static setting, one can construct a (1 ± ε) hypergraph cut-sparsifier using Õ(nr/ε²) bits of space [Chen-Khanna-Nagda FOCS 2020], and in the setting of dynamic streams using Õ(nr log m/ε²) bits of space [Khanna-Putterman-Sudan FOCS 2024]; here the Õ notation hides terms that are polylogarithmic in n, we use m to denote the total number of hyperedges in the hypergraph, and r to denote the maximum arity of a hyperedge. Until now, the best known space complexity for insertion-only streams has been the same as that for dynamic streams. This naturally raises the question of understanding the complexity of hypergraph sparsification in insertion-only streams. Perhaps surprisingly, in this work we show that in insertion-only streams, a (1 ± ε) cut-sparsifier can be computed in Õ(nr/ε²) bits of space, matching the complexity of the static setting. As a consequence, this also establishes an Ω(log m) factor separation between the space complexity of hypergraph cut sparsification in insertion-only streams and dynamic streams, as the latter provably requires Ω(nr log m) bits of space. To better explain this gap, we then show a more general result: namely, if the stream has at most k hyperedge deletions, then Õ(nr log k/ε²) bits of space suffice for hypergraph cut sparsification. Thus the space complexity smoothly interpolates between the insertion-only regime (k = 0) and the fully dynamic regime (k = m).
Our algorithmic results are driven by a key technical insight: once sufficiently many hyperedges have been inserted into the stream (relative to the number of allowed deletions), we can significantly reduce the size of the underlying hypergraph by irrevocably contracting large subsets of vertices. Finally, we complement this result with an essentially matching lower bound of Ω(nr log(k/n)) bits, thus providing an essentially tight characterization of the space complexity of hypergraph cut sparsification across a spectrum of streaming models.
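Two of the notions above can be illustrated with a small, self-contained sketch (illustration only, not the paper's algorithm; the toy hypergraph and helper names are hypothetical): merging parallel hyperedges into one reweighted copy preserves every cut exactly, and contracting a vertex subset preserves every cut that keeps the subset on one side.

```python
# Illustration only (not the paper's algorithm): brute-force cut values,
# the (1 ± eps) sparsifier check, and why contracting a vertex subset is
# safe for cuts that do not split it. The toy hypergraph is hypothetical.
from itertools import chain, combinations

def cut_value(edges, S):
    """Total weight of hyperedges with at least one vertex on each side."""
    S = set(S)
    return sum(w for e, w in edges if e & S and e - S)

def is_sparsifier(n, edges, sparse_edges, eps):
    """Check the (1 ± eps) guarantee over all proper nonempty cuts."""
    V = range(n)
    for S in chain.from_iterable(combinations(V, k) for k in range(1, n)):
        c, c2 = cut_value(edges, S), cut_value(sparse_edges, S)
        if not ((1 - eps) * c <= c2 <= (1 + eps) * c):
            return False
    return True

def contract(edges, group, rep):
    """Merge the vertices in `group` into `rep`; drop collapsed hyperedges."""
    group = set(group)
    out = []
    for e, w in edges:
        e2 = {rep if v in group else v for v in e}
        if len(e2) > 1:  # a hyperedge inside `group` never crosses a cut now
            out.append((e2, w))
    return out

H = [({0, 1, 2}, 1.0), ({0, 1, 2}, 1.0), ({2, 3}, 1.0)]
H_merged = [({0, 1, 2}, 2.0), ({2, 3}, 1.0)]  # merge parallel copies
print(is_sparsifier(4, H, H_merged, eps=0.0))  # True: cuts preserved exactly

Hc = contract(H, group={0, 1}, rep=0)
print(cut_value(H, {2}) == cut_value(Hc, {2}))  # True: {0,1} stays together
```

The brute-force check enumerates all 2ⁿ cuts, so it is only a sanity check on toy instances; the point of sparsification is precisely to certify all cuts without such enumeration.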
Award ID(s):
2152413
PAR ID:
10671726
Author(s) / Creator(s):
Editor(s):
Censor-Hillel, Keren; Grandoni, Fabrizio; Ouaknine, Joel; Puppis, Gabriele
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
334
ISSN:
1868-8969
Page Range / eLocation ID:
108:1-108:11
Subject(s) / Keyword(s):
Sparsification; sketching; hypergraphs; Theory of computation → Sketching and sampling
Format(s):
Medium: X; Other: application/pdf
Size(s):
11 pages; 730824 bytes
Sponsoring Org:
National Science Foundation
More Like this
  1. Bringmann, Karl; Grohe, Martin; Puppis, Gabriele; Svensson, Ola (Ed.)
Recently, a number of variants of the notion of cut-preserving hypergraph sparsification have been studied in the literature. These variants include directed hypergraph sparsification, submodular hypergraph sparsification, general notions of approximation including spectral approximations, and more general notions like sketching that can answer cut queries using more general data structures than just sparsifiers. In this work, we provide reductions between these different variants of hypergraph sparsification and establish new upper and lower bounds on the space complexity of preserving their cuts. Specifically, we show that: 1) (1 ± ε) directed hypergraph spectral (respectively cut) sparsification on n vertices efficiently reduces to (1 ± ε) undirected hypergraph spectral (respectively cut) sparsification on n² + 1 vertices. Using the works of Lee (STOC 2023) and of Jambulapati, Liu, and Sidford (STOC 2023), this gives us directed hypergraph spectral sparsifiers with O(n² log²(n) / ε²) hyperedges, and directed hypergraph cut sparsifiers with O(n² log(n) / ε²) hyperedges by using the work of Chen, Khanna, and Nagda (FOCS 2020), both of which improve upon the work of Oko, Sakaue, and Tanigawa (ICALP 2023). 2) Any cut sketching scheme which preserves all cuts in any directed hypergraph on n vertices to a (1 ± ε) factor (for ε = 1/(2^{O(√{log(n)})})) must have worst-case bit complexity n^{3 - o(1)}. Because directed hypergraphs are a subclass of submodular hypergraphs, this also shows a worst-case sketching lower bound of n^{3 - o(1)} bits for sketching cuts in general submodular hypergraphs. 3) (1 ± ε) monotone submodular hypergraph cut sparsification on n vertices efficiently reduces to (1 ± ε) symmetric submodular hypergraph sparsification on n+1 vertices. Using the work of Jambulapati et al. (FOCS 2023), this gives us monotone submodular hypergraph sparsifiers with Õ(n / ε²) hyperedges, improving on the O(n³ / ε²) hyperedge bound of Kenneth and Krauthgamer (arXiv 2023).
At a high level, our results use the same general principle, namely, by showing that cuts in one class of hypergraphs can be simulated by cuts in a simpler class of hypergraphs, we can leverage sparsification results for the simpler class of hypergraphs. 
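To make the directed case concrete, here is a toy sketch (illustrative code, not the paper's reduction) of the standard directed hypergraph cut function used in this literature: a directed hyperedge with tail set T and head set H is cut by S iff T intersects S and H intersects the complement. Unlike undirected cuts, this cut function is asymmetric, which is one reason directed sparsification is harder and requires a larger vertex set (n² + 1) after reduction.

```python
# Toy sketch (not the paper's reduction): the standard directed hypergraph
# cut function. Edge (T, H) is cut by S iff some tail vertex lies inside S
# and some head vertex lies outside S -- so cut(S) != cut(V - S) in general.
def dir_cut_value(edges, S):
    S = set(S)
    return sum(w for T, Hd, w in edges if T & S and Hd - S)

E = [({0}, {1}, 1.0)]         # one hyperedge directed from tail {0} to head {1}
print(dir_cut_value(E, {0}))  # 1.0: the edge leaves S = {0}
print(dir_cut_value(E, {1}))  # 0: nothing leaves S = {1}
```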
  2. A $(1\pm\epsilon)$-sparsifier of a hypergraph $G(V, E)$ is a (weighted) subgraph that preserves the value of every cut to within a $(1\pm\epsilon)$ factor. It is known that every hypergraph with $n$ vertices admits a $(1\pm\epsilon)$-sparsifier with $\tilde{O}(n/\epsilon^{2})$ hyperedges. In this work, we explore the task of building such a sparsifier by using only linear measurements (a linear sketch) over the hyperedges of $G$, and provide nearly-matching upper and lower bounds for this task. Specifically, we show that there is a randomized linear sketch of size $\tilde{O}(nr\log(m)/\epsilon^{2})$ bits which with high probability contains sufficient information to recover a $(1\pm\epsilon)$ cut-sparsifier with $\tilde{O}(n/\epsilon^{2})$ hyperedges for any hypergraph with at most $m$ edges, each of which has arity bounded by $r$. This immediately gives a dynamic streaming algorithm for hypergraph cut sparsification with an identical space complexity, improving on the previous best known bound of $\tilde{O}(nr^{2}\log^{4}(m)/\epsilon^{2})$ bits of space (Guha, McGregor, and Tench, PODS 2015). We complement our algorithmic result above with a nearly-matching lower bound. We show that for every $\epsilon\in(0,1)$, one needs $\Omega(nr\log(m/n)/\log(n))$ bits to construct a $(1\pm\epsilon)$-sparsifier via linear sketching, thus showing that our linear sketch achieves an optimal dependence on both $r$ and $\log(m)$. The starting point for our improved algorithm is importance sampling of hyperedges based on the new notion of $k$-cut strength introduced in the recent work of Quanrud (SODA 2024). The natural algorithm based on this concept leads to $\log m$ levels of sampling where errors can potentially accumulate, and this accounts for the polylog$(m)$ losses in the sketch size of the natural algorithm.
We develop a more intricate analysis of the accumulation of error to show that most levels do not contribute to the error and the actual loss is only polylog$(n)$. Combined with careful preprocessing (and analysis), this enables us to get rid of all extraneous $\log m$ factors in the sketch size, but the quadratic dependence on $r$ remains. This dependence originates from the use of correlated $\ell_{0}$-samplers to recover a large number of low-strength edges in a hypergraph simultaneously by looking at neighborhoods of individual vertices. In graphs, this leads to the discovery of $\Omega(n)$ edges in a single shot, whereas in hypergraphs, it may potentially reveal only $O(n/r)$ new edges, thus requiring $\Omega(r)$ rounds of recovery. To remedy this, we introduce a new technique of random fingerprinting of hyperedges, which effectively eliminates the correlations created by large-arity hyperedges and leads to a scheme for recovering hyperedges of low strength with an optimal dependence on $r$. Putting all these ingredients together yields our linear sketching algorithm. Our lower bound is established by a reduction from the universal relation problem in the one-way communication setting.
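The strength-based importance-sampling step admits a simplified sketch (illustration only: the actual constructions compute $k$-cut strengths and analyze $\log m$ sampling levels, whereas this toy version takes per-edge strengths and an oversampling constant `C` as given, hypothetical inputs): keep each hyperedge with probability inversely proportional to its strength and reweight by the inverse probability, so every cut is preserved in expectation.

```python
# Simplified sketch of strength-based importance sampling (illustration
# only; real algorithms compute k-cut strengths rather than receiving them,
# and use many coordinated sampling levels).
import random

def sample_by_strength(edges, strengths, C, rng):
    """Keep edge e with prob p_e = min(1, C / strengths[e]); reweight by 1/p_e.

    edges: dict edge_id -> weight. Unbiased for every cut in expectation,
    since E[kept weight of e] = p_e * (w_e / p_e) = w_e.
    """
    sparsifier = {}
    for e, w in edges.items():
        p = min(1.0, C / strengths[e])
        if rng.random() < p:
            sparsifier[e] = w / p  # inverse-probability reweighting
    return sparsifier

rng = random.Random(0)
edges = {"e1": 1.0, "e2": 1.0, "e3": 1.0}
strengths = {"e1": 100.0, "e2": 5.0, "e3": 1.0}  # weak edges are kept w.h.p.
kept = sample_by_strength(edges, strengths, C=4.0, rng=rng)
```

High-strength edges (here `e1`) are sampled at a low rate, while low-strength edges (here `e3`, whose keep-probability caps at 1) are retained deterministically; this is what keeps small cuts, which are dominated by weak edges, well approximated.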