We present a new technique for efficiently removing almost all short cycles in a graph without unintentionally removing its triangles. Consequently, triangle finding problems do not become easy even in almost kcycle free graphs, for any constant k≥ 4.
Triangle finding is at the base of many conditional lower bounds in P, mainly for distance computation problems, and the existence of many 4 or 5cycles in a worstcase instance had been the obstacle towards resolving major open questions.
Hardness of approximation: Are there distance oracles with m1+o(1) preprocessing time and mo(1) query time that achieve a constant approximation? Existing algorithms with such desirable time bounds only achieve superconstant approximation factors, while only 3− factors were conditionally ruled out (Pătraşcu, Roditty, and Thorup; FOCS 2012). We prove that no O(1) approximations are possible, assuming the 3SUM or APSP conjectures. In particular, we prove that kapproximations require Ω(m1+1/ck) time, which is tight up to the constant c. The lower bound holds even for the offline version where we are given the queries in advance, and extends to other problems such as dynamic shortest paths. The 4Cycle problem: An infamous open question in finegrained complexity is to establish any surprising consequences from a subquadratic or even lineartime algorithm for detecting a 4cycle in a graph. This is arguably one of the simplest problems without a nearlinear time algorithm nor a conditional lower bound. We prove that Ω(m1.1194) time is needed for kcycle detection for all k≥ 4, unless we can detect a triangle in √ndegree graphs in O(n2−δ) time; a breakthrough that is not known to follow even from optimal matrix multiplication algorithms.
more »
« less
The Complexity of AverageCase Dynamic Subgraph Counting
Statistics of small subgraph counts such as triangles, fourcycles, and st paths of short lengths reveal important structural properties of the underlying graph. These problems have been widely studied in social network analysis. In most relevant applications, the graphs are not only massive but also change dynamically over time. Most of these problems become hard in the dynamic setting when considering the worst case. In this paper, we ask whether the question of small subgraph counting over dynamic graphs is hard also in the average case.
We consider the simplest possible average case model where the updates follow an ErdősRényi graph: each update selects a pair of vertices (u, v) uniformly at random and flips the existence of the edge (u, v). We develop new lower bounds and matching algorithms in this model for counting fourcycles, counting triangles through a specified point s, or a random queried point, and st paths of length 3, 4 and 5. Our results indicate while computing st paths of length 3, and 4 are easy in the average case with O(1) update time (note that they are hard in the worst case), it becomes hard when considering st paths of length 5.
We introduce new techniques which allow us to get averagecase hardness for these graph problems from the worstcase hardness of the Online Matrix vector problem (OMv). Our techniques rely on recent advances in finegrained averagecase complexity. Our techniques advance this literature, giving the ability to prove new lower bounds on averagecase dynamic algorithms.
Read More: https://epubs.siam.org/doi/abs/10.1137/1.9781611977073.23
more »
« less
 NSFPAR ID:
 10314273
 Date Published:
 Journal Name:
 Proceedings of the 2022 Annual ACMSIAM Symposium on Discrete Algorithms (SODA)
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


We study graph computations in an enhanced data streaming setting, where a spacebounded client reading the edge stream of a massive graph may delegate some of its work to a cloud service. We seek algorithms that allow the client to verify a purported proof sent by the cloud service that the work done in the cloud is correct. A line of work starting with Chakrabarti et al. (ICALP 2009) has provided such algorithms, which we call schemes, for several statistical and graphtheoretic problems, many of which exhibit a tradeoff between the length of the proof and the space used by the streaming verifier. This work designs new schemes for a number of basic graph problems  including triangle counting, maximum matching, topological sorting, and singlesource shortest paths  where past work had either failed to obtain smooth tradeoffs between these two key complexity measures or only obtained suboptimal tradeoffs. Our key innovation is having the verifier compute certain nonlinear sketches of the input stream, leading to either new or improved tradeoffs. In many cases, our schemes, in fact, provide optimal tradeoffs up to logarithmic factors. Specifically, for most graph problems that we study, it is known that the product of the verifier’s space cost v and the proof length h must be at least Omega(n^2) for nvertex graphs. However, matching upper bounds are only known for a handful of settings of h and v on the curve h*v = ~Theta(n^2). For example, for counting triangles and maximum matching, schemes with costs lying on this curve are only known for (h = ~O(n²), v = ~O(1)), (h = ~O(n), v = ~O(n)), and the trivial (h = ~O(1), v = ~O(n²)). A major message of this work is that by exploiting nonlinear sketches, a significant "portion" of costs on the tradeoff curve h*v=n^2 can be achieved.more » « less

Dyckreachability is a fundamental formulation for program analysis, which has been widely used to capture properlymatchedparenthesis program properties such as function calls/returns and field writes/reads. Bidirected Dyckreachability is a relaxation of Dyckreachability on bidirected graphs where each edge u → ( i v labeled by an open parenthesis “( i ” is accompanied with an inverse edge v → ) i u labeled by the corresponding close parenthesis “) i ”, and vice versa. In practice, many client analyses such as alias analysis adopt the bidirected Dyckreachability formulation. Bidirected Dyckreachability admits an optimal reachability algorithm. Specifically, given a graph with n nodes and m edges, the optimal bidirected Dyckreachability algorithm computes allpairs reachability information in O ( m ) time. This paper focuses on the dynamic version of bidirected Dyckreachability. In particular, we consider the problem of maintaining allpairs Dyckreachability information in bidirected graphs under a sequence of edge insertions and deletions. Dynamic bidirected Dyckreachability can formulate many program analysis problems in the presence of code changes. Unfortunately, solving dynamic graph reachability problems is challenging. For example, even for maintaining transitive closure, the fastest deterministic dynamic algorithm requires O ( n 2 ) update time to achieve O (1) query time. Allpairs Dyckreachability is a generalization of transitive closure. Despite extensive research on incremental computation, there is no algorithmic development on dynamic graph algorithms for program analysis with worstcase guarantees. Our work fills the gap and proposes the first dynamic algorithm for Dyck reachability on bidirected graphs. Our dynamic algorithms can handle each graph update ( i.e. , edge insertion and deletion) in O ( n ·α( n )) time and support any allpairs reachability query in O (1) time, where α( n ) is the inverse Ackermann function. We have implemented and evaluated our dynamic algorithm on an alias analysis and a contextsensitive datadependence analysis for Java. We compare our dynamic algorithms against a straightforward approach based on the O ( m )time optimal bidirected Dyckreachability algorithm and a recent incremental Datalog solver. Experimental results show that our algorithm achieves orders of magnitude speedup over both approaches.more » « less

Over the last two decades, frameworks for distributedmemory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the defacto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting kcliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space. In addition to our theoretical results, we simulate our triangle counting algorithms in realworld graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems.more » « less

We revisit the muchstudied problem of spaceefficiently estimating the number of triangles in a graph stream, and extensions of this problem to counting fixedsized cliques and cycles, obtaining a number of new upper and lower bounds. For the important special case of counting triangles, we give a $4$pass, $(1\pm\varepsilon)$approximate, randomized algorithm that needs at most $\widetilde{O}(\varepsilon^{2}\cdot m^{3/2}/T)$ space, where $m$ is the number of edges and $T$ is a promised lower bound on the number of triangles. This matches the space bound of a very recent algorithm (McGregor et al., PODS 2016), with an arguably simpler and more general technique. We give an improved multipass lower bound of $\Omega(\min\{m^{3/2}/T, m/\sqrt{T}\})$, applicable at essentially all densities $\Omega(n) \le m \le O(n^2)$. We also prove other multipass lower bounds in terms of various structural parameters of the input graph. Together, our results resolve a couple of open questions raised in recent work (Braverman et al., ICALP 2013). Our presentation emphasizes more general frameworks, for both upper and lower bounds. We give a sampling algorithm for counting arbitrary subgraphs and then improve it via combinatorial means in the special cases of counting odd cliques and odd cycles. Our results show that these problems are considerably easier in the cashregister streaming model than in the turnstile model, where previous work had focused (Manjunath et al., ESA 2011; Kane et al., ICALP 2012). We use Tur{\'a}n graphs and related gadgets to derive lower bounds for counting cliques and cycles, with trianglecounting lower bounds following as a corollary.more » « less