skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Counting Simplices in Hypergraph Streams
We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a k-uniform hypergraph H with n vertices and m hyperedges, each hyperedge being a k-sized subset of vertices. A k-simplex in H is a subhypergraph on k+1 vertices X such that all k+1 possible hyperedges among X exist in H. The goal is to process the hyperedges of H, which arrive in an arbitrary order as a data stream, and compute a good estimate of T_k(H), the number of k-simplices in H. We design a suite of algorithms for this problem. As with triangle-counting in graphs (which is the special case k = 2), sublinear space is achievable but only under a promise of the form T_k(H) ≥ T. Under such a promise, our algorithms use at most four passes and together imply a space bound of O(ε^{-2} log δ^{-1} polylog n ⋅ min{(m^{1+1/k})/T, m/(T^{2/(k+1)})}) for each fixed k ≥ 3, in order to guarantee an estimate within (1±ε)T_k(H) with probability ≥ 1-δ. We also give a simpler 1-pass algorithm that achieves O(ε^{-2} log δ^{-1} log n⋅ (m/T) (Δ_E + Δ_V^{1-1/k})) space, where Δ_E (respectively, Δ_V) denotes the maximum number of k-simplices that share a hyperedge (respectively, a vertex), which generalizes a previous result for the k = 2 case. We complement these algorithmic results with space lower bounds of the form Ω(ε^{-2}), Ω(m^{1+1/k}/T), Ω(m/T^{1-1/k}) and Ω(mΔ_V^{1/k}/T) for multi-pass algorithms and Ω(mΔ_E/T) for 1-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs.  more » « less
Award ID(s):
1907738
PAR ID:
10379938
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Leibniz international proceedings in informatics
Volume:
244
ISSN:
1868-8969
Page Range / eLocation ID:
32:1 - 32:19
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Censor-Hillel, Keren; Grandoni, Fabrizio; Ouaknine, Joel; Puppis, Gabriele (Ed.)
    We study the problem of constructing hypergraph cut sparsifiers in the streaming model where a hypergraph on n vertices is revealed either via an arbitrary sequence of hyperedge insertions alone (insertion-only streaming model) or via an arbitrary sequence of hyperedge insertions and deletions (dynamic streaming model). For any ε ∈ (0,1), a (1 ± ε) hypergraph cut-sparsifier of a hypergraph H is a reweighted subgraph H' whose cut values approximate those of H to within a (1 ± ε) factor. Prior work shows that in the static setting, one can construct a (1 ± ε) hypergraph cut-sparsifier using Õ(nr/ε²) bits of space [Chen-Khanna-Nagda FOCS 2020], and in the setting of dynamic streams using Õ(nrlog m/ε²) bits of space [Khanna-Putterman-Sudan FOCS 2024]; here the Õ notation hides terms that are polylogarithmic in n, and we use m to denote the total number of hyperedges in the hypergraph. Up until now, the best known space complexity for insertion-only streams has been the same as that for the dynamic streams. This naturally poses the question of understanding the complexity of hypergraph sparsification in insertion-only streams. Perhaps surprisingly, in this work we show that in insertion-only streams, a (1 ± ε) cut-sparsifier can be computed in Õ(nr/ε²) bits of space, matching the complexity of the static setting. As a consequence, this also establishes an Ω(log m) factor separation between the space complexity of hypergraph cut sparsification in insertion-only streams and dynamic streams, as the latter is provably known to require Ω(nr log m) bits of space. To better explain this gap, we then show a more general result: namely, if the stream has at most k hyperedge deletions then Õ(n r log k/ε²) bits of space suffice for hypergraph cut sparsification. Thus the space complexity smoothly interpolates between the insertion-only regime (k = 0) and the fully dynamic regime (k = m). Our algorithmic results are driven by a key technical insight: once sufficiently many hyperedges have been inserted into the stream (relative to the number of allowed deletions), we can significantly reduce the underlying hypergraph by size by irrevocably contracting large subsets of vertices. Finally, we complement this result with an essentially matching lower bound of Ω(n r log(k/n)) bits, thus providing essentially a tight characterization of the space complexity for hypergraph cut-sparsification across a spectrum of streaming models. 
    more » « less
  2. Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de-facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)-approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting k-cliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space. In addition to our theoretical results, we simulate our triangle counting algorithms in real-world graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems. 
    more » « less
  3. Recently, a number of variants of the notion of cut-preserving hypergraph sparsification have been studied in the literature. These variants include directed hypergraph sparsification, submodular hypergraph sparsification, general notions of approximation including spectral approximations, and more general notions like sketching that can answer cut queries using more general data structures than just sparsifiers. In this work, we provide reductions between these different variants of hypergraph sparsification and establish new upper and lower bounds on the space complexity of preserving their cuts. Specifically, we show that: 1) (1 ± ε) directed hypergraph spectral (respectively cut) sparsification on n vertices efficiently reduces to (1 ± ε) undirected hypergraph spectral (respectively cut) sparsification on n² + 1 vertices. Using the work of Lee and Jambulapati, Liu, and Sidford (STOC 2023) this gives us directed hypergraph spectral sparsifiers with O(n² log²(n) / ε²) hyperedges and directed hypergraph cut sparsifiers with O(n² log(n)/ ε²) hyperedges by using the work of Chen, Khanna, and Nagda (FOCS 2020), both of which improve upon the work of Oko, Sakaue, and Tanigawa (ICALP 2023). 2) Any cut sketching scheme which preserves all cuts in any directed hypergraph on n vertices to a (1 ± ε) factor (for ε = 1/(2^{O(√{log(n)})})) must have worst-case bit complexity n^{3 - o(1)}. Because directed hypergraphs are a subclass of submodular hypergraphs, this also shows a worst-case sketching lower bound of n^{3 - o(1)} bits for sketching cuts in general submodular hypergraphs. 3) (1 ± ε) monotone submodular hypergraph cut sparsification on n vertices efficiently reduces to (1 ± ε) symmetric submodular hypergraph sparsification on n+1 vertices. Using the work of Jambulapati et. al. (FOCS 2023) this gives us monotone submodular hypergraph sparsifiers with Õ(n / ε²) hyperedges, improving on the O(n³ / ε²) hyperedge bound of Kenneth and Krauthgamer (arxiv 2023). At a high level, our results use the same general principle, namely, by showing that cuts in one class of hypergraphs can be simulated by cuts in a simpler class of hypergraphs, we can leverage sparsification results for the simpler class of hypergraphs. 
    more » « less
  4. Recently, a number of variants of the notion of cut-preserving hypergraph sparsification have been studied in the literature. These variants include directed hypergraph sparsification, submodular hypergraph sparsification, general notions of approximation including spectral approximations, and more general notions like sketching that can answer cut queries using more general data structures than just sparsifiers. In this work, we provide reductions between these different variants of hypergraph sparsification and establish new upper and lower bounds on the space complexity of preserving their cuts. Specifically, we show that: 1) (1 ± ε) directed hypergraph spectral (respectively cut) sparsification on n vertices efficiently reduces to (1 ± ε) undirected hypergraph spectral (respectively cut) sparsification on n² + 1 vertices. Using the work of Lee and Jambulapati, Liu, and Sidford (STOC 2023) this gives us directed hypergraph spectral sparsifiers with O(n² log²(n) / ε²) hyperedges and directed hypergraph cut sparsifiers with O(n² log(n)/ ε²) hyperedges by using the work of Chen, Khanna, and Nagda (FOCS 2020), both of which improve upon the work of Oko, Sakaue, and Tanigawa (ICALP 2023). 2) Any cut sketching scheme which preserves all cuts in any directed hypergraph on n vertices to a (1 ± ε) factor (for ε = 1/(2^{O(√{log(n)})})) must have worst-case bit complexity n^{3 - o(1)}. Because directed hypergraphs are a subclass of submodular hypergraphs, this also shows a worst-case sketching lower bound of n^{3 - o(1)} bits for sketching cuts in general submodular hypergraphs. 3) (1 ± ε) monotone submodular hypergraph cut sparsification on n vertices efficiently reduces to (1 ± ε) symmetric submodular hypergraph sparsification on n+1 vertices. Using the work of Jambulapati et. al. (FOCS 2023) this gives us monotone submodular hypergraph sparsifiers with Õ(n / ε²) hyperedges, improving on the O(n³ / ε²) hyperedge bound of Kenneth and Krauthgamer (arxiv 2023). At a high level, our results use the same general principle, namely, by showing that cuts in one class of hypergraphs can be simulated by cuts in a simpler class of hypergraphs, we can leverage sparsification results for the simpler class of hypergraphs. 
    more » « less
  5. Bringmann, Karl; Grohe, Martin; Puppis, Gabriele; Svensson, Ola (Ed.)
    Recently, a number of variants of the notion of cut-preserving hypergraph sparsification have been studied in the literature. These variants include directed hypergraph sparsification, submodular hypergraph sparsification, general notions of approximation including spectral approximations, and more general notions like sketching that can answer cut queries using more general data structures than just sparsifiers. In this work, we provide reductions between these different variants of hypergraph sparsification and establish new upper and lower bounds on the space complexity of preserving their cuts. Specifically, we show that: 1) (1 ± ε) directed hypergraph spectral (respectively cut) sparsification on n vertices efficiently reduces to (1 ± ε) undirected hypergraph spectral (respectively cut) sparsification on n² + 1 vertices. Using the work of Lee and Jambulapati, Liu, and Sidford (STOC 2023) this gives us directed hypergraph spectral sparsifiers with O(n² log²(n) / ε²) hyperedges and directed hypergraph cut sparsifiers with O(n² log(n)/ ε²) hyperedges by using the work of Chen, Khanna, and Nagda (FOCS 2020), both of which improve upon the work of Oko, Sakaue, and Tanigawa (ICALP 2023). 2) Any cut sketching scheme which preserves all cuts in any directed hypergraph on n vertices to a (1 ± ε) factor (for ε = 1/(2^{O(√{log(n)})})) must have worst-case bit complexity n^{3 - o(1)}. Because directed hypergraphs are a subclass of submodular hypergraphs, this also shows a worst-case sketching lower bound of n^{3 - o(1)} bits for sketching cuts in general submodular hypergraphs. 3) (1 ± ε) monotone submodular hypergraph cut sparsification on n vertices efficiently reduces to (1 ± ε) symmetric submodular hypergraph sparsification on n+1 vertices. Using the work of Jambulapati et. al. (FOCS 2023) this gives us monotone submodular hypergraph sparsifiers with Õ(n / ε²) hyperedges, improving on the O(n³ / ε²) hyperedge bound of Kenneth and Krauthgamer (arxiv 2023). At a high level, our results use the same general principle, namely, by showing that cuts in one class of hypergraphs can be simulated by cuts in a simpler class of hypergraphs, we can leverage sparsification results for the simpler class of hypergraphs. 
    more » « less