We consider the maximum matching problem in the semi-streaming model formalized by Feigenbaum, Kannan, McGregor, Suri, and Zhang that is inspired by giant graphs of today. As our main result, we give a two-pass (1/2 + 1/16)-approximation algorithm for triangle-free graphs and a two-pass (1/2 + 1/32)-approximation algorithm for general graphs; these improve the approximation ratios of 1/2 + 1/52 for bipartite graphs and 1/2 + 1/140 for general graphs by Konrad, Magniez, and Mathieu. In three passes, we are able to achieve approximation ratios of 1/2 + 1/10 for triangle-free graphs and 1/2 + 1/19.753 for general graphs. We also give a multi-pass algorithm where we bound the number of passes precisely—we give a (2/3 − ε)- approximation algorithm that uses 2/(3ε) passes for triangle-free graphs and 4/(3ε) passes for general graphs. Our algorithms are simple and combinatorial, use O(n log n) space, and (can be implemented to) have O(1) update time per edge.
For general graphs, our multi-pass algorithm improves the best known deterministic algorithms in terms of the number of passes:
* Ahn and Guha give a (2/3−ε)-approximation algorithm that uses O(log(1/ε)/ε2) passes, whereas our (2/3 − ε)-approximation algorithm uses 4/(3ε) passes;
* they also give a (1 − ε)-approximation algorithm that uses O(log n · poly(1/ε)) passes, where n is the number of vertices of the input graph; although our algorithm is (2/3−ε)-approximation, our number of passes do not depend on n.
Earlier multi-pass algorithms either have a large constant inside big-O notation for the number of passes or the constant cannot be determined due to the involved analysis, so our multi-pass algorithm should use much fewer passes for approximation ratios bounded slightly below 2/3.
more »
« less
Matchings in Low-Arboricity Graphs in the Dynamic Graph Stream Model
We consider the problem of estimating the size of a maximum matching in low-arboricity graphs in the dynamic graph stream model. In this setting, an algorithm with limited memory makes multiple passes over a stream of edge insertions and deletions, resulting in a low-arboricity graph. Let n be the number of vertices of the input graph, and α be its arboricity. We give the following results.
1) As our main result, we give a three-pass streaming algorithm that produces an (α + 2)(1 + ε)-approximation and uses space O(ε^{-2}⋅α²⋅n^{1/2}⋅log n). This result should be contrasted with the Ω(α^{-5/2}⋅n^{1/2}) space lower bound established by [Assadi et al., SODA'17] for one-pass algorithms, showing that, for graphs of constant arboricity, the one-pass space lower bound can be achieved in three passes (up to poly-logarithmic factors). Furthermore, we obtain a two-pass algorithm that uses space O(ε^{-2}⋅α²⋅n^{3/5}⋅log n).
2) We also give a (1+ε)-approximation multi-pass algorithm, where the space used is parameterized by an upper bound on the size of a largest matching. For example, using O(log log n) passes, the space required is O(ε^{-1}⋅α²⋅k⋅log n), where k denotes an upper bound on the size of a largest matching. Finally, we define a notion of arboricity in the context of matrices. This is a natural measure of the sparsity of a matrix that is more nuanced than simply bounding the total number of nonzero entries, but less restrictive than bounding the number of nonzero entries in each row and column. For such matrices, we exploit our results on estimating matching size to present upper bounds for the problem of rank estimation in the dynamic data stream model.
more »
« less
- Award ID(s):
- 1934846
- PAR ID:
- 10560996
- Editor(s):
- Barman, Siddharth; Lasota, Sławomir
- Publisher / Repository:
- Schloss Dagstuhl – Leibniz-Zentrum für Informatik
- Date Published:
- Volume:
- 323
- ISSN:
- 1868-8969
- ISBN:
- 978-3-95977-355-3
- Page Range / eLocation ID:
- 323-323
- Subject(s) / Keyword(s):
- Data Streams Graph Matching Graph Arboricity Theory of computation → Sketching and sampling
- Format(s):
- Medium: X Size: 15 pages; 813817 bytes Other: application/pdf
- Size(s):
- 15 pages 813817 bytes
- Right(s):
- Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Chan, Timothy ; Fischer, Johannes ; Iacono, John ; Herman, Grzegorz (Ed.)The maximum coverage problem is to select k sets, from a collection of m sets, such that the cardinality of their union, in a universe of size n, is maximized. We consider (1-1/e-ε)-approximation algorithms for this NP-hard problem in three standard data stream models. 1) Dynamic Model. The stream consists of a sequence of sets being inserted and deleted. Our multi-pass algorithm uses ε^{-2} k ⋅ polylog(n,m) space. The best previous result (Assadi and Khanna, SODA 2018) used (n +ε^{-4} k) polylog(n,m) space. While both algorithms use O(ε^{-1} log m) passes, our analysis shows that, when ε ≤ 1/log log m, it is possible to reduce the number of passes by a 1/log log m factor without incurring additional space. 2) Random Order Model. In this model, there are no deletions, and the sets forming the instance are uniformly randomly permuted to form the input stream. We show that a single pass and k polylog(n,m) space suffices for arbitrary small constant ε. The best previous result, by Warneke et al. (ESA 2023), used k² polylog(n,m) space. 3) Insert-Only Model. Lastly, our results, along with numerous previous results, use a sub-sampling technique introduced by McGregor and Vu (ICDT 2017) to sparsify the input instance. We explain how this technique and others used in the paper can be implemented such that the amortized update time of our algorithm is polylogarithmic. This also implies an improvement of the state-of-the-art insert only algorithms in terms of the update time: polylog(m,n) update time suffices, whereas the best previous result by Jaud et al. (SEA 2023) required update time that was linear in k.more » « less
-
Woodruff, David P. (Ed.)We give improved algorithms for maintaining edge-orientations of a fully-dynamic graph, such that the maximum out-degree is bounded. On one hand, we show how to orient the edges such that maximum out-degree is proportional to the arboricity $\alpha$ of the graph, in, either, an amortised update time of $O(\log^2 n \log \alpha)$, or a worst-case update time of $O(\log^3 n \log \alpha)$. On the other hand, motivated by applications including dynamic maximal matching, we obtain a different trade-off. Namely, the improved update time of either $O(\log n \log \alpha)$, amortised, or $O(\log ^2 n \log \alpha)$, worst-case, for the problem of maintaining an edge-orientation with at most $O(\alpha + \log n)$ out-edges per vertex. Finally, all of our algorithms naturally limit the recourse to be polylogarithmic in $n$ and $\alpha$. Our algorithms adapt to the current arboricity of the graph, and yield improvements over previous work: Firstly, we obtain deterministic algorithms for maintaining a $(1+\varepsilon)$ approximation of the maximum subgraph density, $\rho$, of the dynamic graph. Our algorithms have update times of $O(\varepsilon^{-6}\log^3 n \log \rho)$ worst-case, and $O(\varepsilon^{-4}\log^2 n \log \rho)$ amortised, respectively. We may output a subgraph $H$ of the input graph where its density is a $(1+\varepsilon)$ approximation of the maximum subgraph density in time linear in the size of the subgraph. These algorithms have improved update time compared to the $O(\varepsilon^{-6}\log ^4 n)$ algorithm by Sawlani and Wang from STOC 2020. Secondly, we obtain an $O(\varepsilon^{-6}\log^3 n \log \alpha)$ worst-case update time algorithm for maintaining a $(1~+~\varepsilon)\textnormal{OPT} + 2$ approximation of the optimal out-orientation of a graph with adaptive arboricity $\alpha$, improving the $O(\varepsilon^{-6}\alpha^2 \log^3 n)$ algorithm by Christiansen and Rotenberg from ICALP 2022. This yields the first worst-case polylogarithmic dynamic algorithm for decomposing into $O(\alpha)$ forests. Thirdly, we obtain arboricity-adaptive fully-dynamic deterministic algorithms for a variety of problems including maximal matching, $\Delta+1$ colouring, and matrix vector multiplication. All update times are worst-case $O(\alpha+\log^2n \log \alpha)$, where $\alpha$ is the current arboricity of the graph. For the maximal matching problem, the state-of-the-art deterministic algorithms by Kopelowitz, Krauthgamer, Porat, and Solomon from ICALP 2014 runs in time $O(\alpha^2 + \log^2 n)$, and by Neiman and Solomon from STOC 2013 runs in time $O(\sqrt{m})$. We give improved running times whenever the arboricity $\alpha \in \omega( \log n\sqrt{\log\log n})$.more » « less
-
We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a k-uniform hypergraph H with n vertices and m hyperedges, each hyperedge being a k-sized subset of vertices. A k-simplex in H is a subhypergraph on k+1 vertices X such that all k+1 possible hyperedges among X exist in H. The goal is to process the hyperedges of H, which arrive in an arbitrary order as a data stream, and compute a good estimate of T_k(H), the number of k-simplices in H. We design a suite of algorithms for this problem. As with triangle-counting in graphs (which is the special case k = 2), sublinear space is achievable but only under a promise of the form T_k(H) ≥ T. Under such a promise, our algorithms use at most four passes and together imply a space bound of O(ε^{-2} log δ^{-1} polylog n ⋅ min{(m^{1+1/k})/T, m/(T^{2/(k+1)})}) for each fixed k ≥ 3, in order to guarantee an estimate within (1±ε)T_k(H) with probability ≥ 1-δ. We also give a simpler 1-pass algorithm that achieves O(ε^{-2} log δ^{-1} log n⋅ (m/T) (Δ_E + Δ_V^{1-1/k})) space, where Δ_E (respectively, Δ_V) denotes the maximum number of k-simplices that share a hyperedge (respectively, a vertex), which generalizes a previous result for the k = 2 case. We complement these algorithmic results with space lower bounds of the form Ω(ε^{-2}), Ω(m^{1+1/k}/T), Ω(m/T^{1-1/k}) and Ω(mΔ_V^{1/k}/T) for multi-pass algorithms and Ω(mΔ_E/T) for 1-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs.more » « less
-
Bringmann, Karl ; Grohe, Martin ; Puppis, Gabriele ; Svensson, Ola (Ed.)We revisit the noisy binary search model of [Karp and Kleinberg, 2007], in which we have n coins with unknown probabilities p_i that we can flip. The coins are sorted by increasing p_i, and we would like to find where the probability crosses (to within ε) of a target value τ. This generalized the fixed-noise model of [Burnashev and Zigangirov, 1974], in which p_i = 1/2 ± ε, to a setting where coins near the target may be indistinguishable from it. It was shown in [Karp and Kleinberg, 2007] that Θ(1/ε² log n) samples are necessary and sufficient for this task. We produce a practical algorithm by solving two theoretical challenges: high-probability behavior and sharp constants. We give an algorithm that succeeds with probability 1-δ from 1/C_{τ, ε} ⋅ (log₂ n + O(log^{2/3} n log^{1/3} 1/(δ) + log 1/(δ))) samples, where C_{τ, ε} is the optimal such constant achievable. For δ > n^{-o(1)} this is within 1 + o(1) of optimal, and for δ ≪ 1 it is the first bound within constant factors of optimal.more » « less