skip to main content

Title: Computing Zigzag Persistence on Graphs in Near-Linear Time
Graphs model real-world circumstances in many applications where they may constantly change to capture the dynamic behavior of the phenomena. Topological persistence which provides a set of birth and death pairs for the topological features is one instrument for analyzing such changing graph data. However, standard persistent homology defined over a growing space cannot always capture such a dynamic process unless shrinking with deletions is also allowed. Hence, zigzag persistence which incorporates both insertions and deletions of simplices is more appropriate in such a setting. Unlike standard persistence which admits nearly linear-time algorithms for graphs, such results for the zigzag version improving the general O(m^ω) time complexity are not known, where ω < 2.37286 is the matrix multiplication exponent. In this paper, we propose algorithms for zigzag persistence on graphs which run in near-linear time. Specifically, given a filtration with m additions and deletions on a graph with n vertices and edges, the algorithm for 0-dimension runs in O(mlog² n+mlog m) time and the algorithm for 1-dimension runs in O(mlog⁴ n) time. The algorithm for 0-dimension draws upon another algorithm designed originally for pairing critical points of Morse functions on 2-manifolds. The algorithm for 1-dimension pairs a negative edge with the earliest positive edge so that a 1-cycle containing both edges resides in all intermediate graphs. Both algorithms achieve the claimed time complexity via dynamic graph data structures proposed by Holm et al. In the end, using Alexander duality, we extend the algorithm for 0-dimension to compute the (p-1)-dimensional zigzag persistence for ℝ^p-embedded complexes in O(mlog² n+mlog m+nlog n) time.  more » « less
Award ID(s):
Author(s) / Creator(s):
Date Published:
Journal Name:
Leibniz international proceedings in informatics
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Dyck-reachability is a fundamental formulation for program analysis, which has been widely used to capture properly-matched-parenthesis program properties such as function calls/returns and field writes/reads. Bidirected Dyck-reachability is a relaxation of Dyck-reachability on bidirected graphs where each edge u → ( i v labeled by an open parenthesis “( i ” is accompanied with an inverse edge v → ) i u labeled by the corresponding close parenthesis “) i ”, and vice versa. In practice, many client analyses such as alias analysis adopt the bidirected Dyck-reachability formulation. Bidirected Dyck-reachability admits an optimal reachability algorithm. Specifically, given a graph with n nodes and m edges, the optimal bidirected Dyck-reachability algorithm computes all-pairs reachability information in O ( m ) time. This paper focuses on the dynamic version of bidirected Dyck-reachability. In particular, we consider the problem of maintaining all-pairs Dyck-reachability information in bidirected graphs under a sequence of edge insertions and deletions. Dynamic bidirected Dyck-reachability can formulate many program analysis problems in the presence of code changes. Unfortunately, solving dynamic graph reachability problems is challenging. For example, even for maintaining transitive closure, the fastest deterministic dynamic algorithm requires O ( n 2 ) update time to achieve O (1) query time. All-pairs Dyck-reachability is a generalization of transitive closure. Despite extensive research on incremental computation, there is no algorithmic development on dynamic graph algorithms for program analysis with worst-case guarantees. Our work fills the gap and proposes the first dynamic algorithm for Dyck reachability on bidirected graphs. Our dynamic algorithms can handle each graph update ( i.e. , edge insertion and deletion) in O ( n ·α( n )) time and support any all-pairs reachability query in O (1) time, where α( n ) is the inverse Ackermann function. We have implemented and evaluated our dynamic algorithm on an alias analysis and a context-sensitive data-dependence analysis for Java. We compare our dynamic algorithms against a straightforward approach based on the O ( m )-time optimal bidirected Dyck-reachability algorithm and a recent incremental Datalog solver. Experimental results show that our algorithm achieves orders of magnitude speedup over both approaches. 
    more » « less
  2. We study the fully dynamic All-Pairs Shortest Paths (APSP) problem in undirected edge-weighted graphs. Given an n-vertex graph G with non-negative edge lengths, that undergoes an online sequence of edge insertions and deletions, the goal is to support approximate distance queries and shortest-path queries. We provide a deterministic algorithm for this problem, that, for a given precision parameter є, achieves approximation factor (loglogn)2O(1/є3), and has amortized update time O(nєlogL) per operation, where L is the ratio of longest to shortest edge length. Query time for distance-query is O(2O(1/є)· logn· loglogL), and query time for shortest-path query is O(|E(P)|+2O(1/є)· logn· loglogL), where P is the path that the algorithm returns. To the best of our knowledge, even allowing any o(n)-approximation factor, no adaptive-update algorithms with better than Θ(m) amortized update time and better than Θ(n) query time were known prior to this work. We also note that our guarantees are stronger than the best current guarantees for APSP in decremental graphs in the adaptive-adversary setting. In order to obtain these results, we consider an intermediate problem, called Recursive Dynamic Neighborhood Cover (RecDynNC), that was formally introduced in [Chuzhoy, STOC ’21]. At a high level, given an undirected edge-weighted graph G undergoing an online sequence of edge deletions, together with a distance parameter D, the goal is to maintain a sparse D-neighborhood cover of G, with some additional technical requirements. Our main technical contribution is twofolds. First, we provide a black-box reduction from APSP in fully dynamic graphs to the RecDynNC problem. Second, we provide a new deterministic algorithm for the RecDynNC problem, that, for a given precision parameter є, achieves approximation factor (loglogm)2O(1/є2), with total update time O(m1+є), where m is the total number of edges ever present in G. This improves the previous algorithm of [Chuzhoy, STOC ’21], that achieved approximation factor (logm)2O(1/є) with similar total update time. Combining these two results immediately leads to the deterministic algorithm for fully-dynamic APSP with the guarantees stated above. 
    more » « less
  3. null (Ed.)
    We present an $\tilde O(m+n^{1.5})$-time randomized algorithm for maximum cardinality bipartite matching and related problems (e.g. transshipment, negative-weight shortest paths, and optimal transport) on $m$-edge, $n$-node graphs. For maximum cardinality bipartite matching on moderately dense graphs, i.e. $m = \Omega(n^{1.5})$, our algorithm runs in time nearly linear in the input size and constitutes the first improvement over the classic $O(m\sqrt{n})$-time [Dinic 1970; Hopcroft-Karp 1971; Karzanov 1973] and $\tilde O(n^\omega)$-time algorithms [Ibarra-Moran 1981] (where currently $\omega\approx 2.373$). On sparser graphs, i.e. when $m = n^{9/8 + \delta}$ for any constant $\delta>0$, our result improves upon the recent advances of [Madry 2013] and [Liu-Sidford 2020b, 2020a] which achieve an $\tilde O(m^{4/3+o(1)})$ runtime. We obtain these results by combining and advancing recent lines of research in interior point methods (IPMs) and dynamic graph algorithms. First, we simplify and improve the IPM of [v.d.Brand-Lee-Sidford-Song 2020], providing a general primal-dual IPM framework and new sampling-based techniques for handling infeasibility induced by approximate linear system solvers. Second, we provide a simple sublinear-time algorithm for detecting and sampling high-energy edges in electric flows on expanders and show that when combined with recent advances in dynamic expander decompositions, this yields efficient data structures for maintaining the iterates of both [v.d.Brand~et~al.] and our new IPMs. Combining this general machinery yields a simpler $\tilde O(n \sqrt{m})$ time algorithm for matching based on the logarithmic barrier function, and our state-of-the-art $\tilde O(m+n^{1.5})$ time algorithm for matching based on the [Lee-Sidford 2014] barrier (as regularized in [v.d.Brand~et~al.]). 
    more » « less
  4. We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in \Otil(n^{5/3 }m^{1/3}) time\footnote{The \Otil(\cdot) notation hides \poly(\log n) factors}. The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. that runs in matrix multiplication time, O(n^\omega). For the special case of unweighted graphs, this improves upon the best previously known running time of \tilde{O}(\min\{n^{\omega},m\sqrt{n},m^{4/3}\}) for m >> n^{7/4} (Colbourn et al. '96, Kelner-Madry '09, Madry et al. '15). The effective resistance metric is essential to our algorithm, as in the work of Madry et al., but we eschew determinant-based and random walk-based techniques used by previous algorithms. Instead, our algorithm is based on Gaussian elimination, and the fact that effective resistance is preserved in the graph resulting from eliminating a subset of vertices (called a Schur complement). As part of our algorithm, we show how to compute \eps-approximate effective resistances for a set SS of vertex pairs via approximate Schur complements in \Otil(m+(n + |S|)\eps^{-2}) time, without using the Johnson-Lindenstrauss lemma which requires \Otil( \min\{(m + |S|)\eps^{-2}, m+n\eps^{-4} +|S|\eps^{-2}\}) time. We combine this approximation procedure with an error correction procedure for handing edges where our estimate isn't sufficiently accurate. 
    more » « less
  5. We study local symmetry breaking problems in the Congest model, focusing on ruling set problems, which generalize the fundamental Maximal Independent Set (MIS) problem. The time (round) complexity of MIS (and ruling sets) have attracted much attention in the Local model. Indeed, recent results (Barenboim et al., FOCS 2012, Ghaffari SODA 2016) for the MIS problem have tried to break the long-standing O(log n)-round "barrier" achieved by Luby's algorithm, but these yield o(log n)-round complexity only when the maximum degree Delta is somewhat small relative to n. More importantly, these results apply only in the Local model. In fact, the best known time bound in the Congest model is still O(log n) (via Luby's algorithm) even for moderately small Delta (i.e., for Delta = Omega(log n) and Delta = o(n)). Furthermore, message complexity has been largely ignored in the context of local symmetry breaking. Luby's algorithm takes O(m) messages on m-edge graphs and this is the best known bound with respect to messages. Our work is motivated by the following central question: can we break the Theta(log n) time complexity barrier and the Theta(m) message complexity barrier in the Congest model for MIS or closely-related symmetry breaking problems? This paper presents progress towards this question for the distributed ruling set problem in the Congest model. A beta-ruling set is an independent set such that every node in the graph is at most beta hops from a node in the independent set. We present the following results: - Time Complexity: We show that we can break the O(log n) "barrier" for 2- and 3-ruling sets. We compute 3-ruling sets in O(log n/log log n) rounds with high probability (whp). More generally we show that 2-ruling sets can be computed in O(log Delta (log n)^(1/2 + epsilon) + log n/log log n) rounds for any epsilon > 0, which is o(log n) for a wide range of Delta values (e.g., Delta = 2^(log n)^(1/2-epsilon)). These are the first 2- and 3-ruling set algorithms to improve over the O(log n)-round complexity of Luby's algorithm in the Congest model. - Message Complexity: We show an Omega(n^2) lower bound on the message complexity of computing an MIS (i.e., 1-ruling set) which holds also for randomized algorithms and present a contrast to this by showing a randomized algorithm for 2-ruling sets that, whp, uses only O(n log^2 n) messages and runs in O(Delta log n) rounds. This is the first message-efficient algorithm known for ruling sets, which has message complexity nearly linear in n (which is optimal up to a polylogarithmic factor). 
    more » « less