skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Random Graph Matching with Improved Noise Robustness
Graph matching, also known as network alignment, refers to finding a bijection between the vertex sets of two given graphs so as to maximally align their edges. This fundamental computational problem arises frequently in multiple fields such as computer vision and biology. Recently, there has been a plethora of work studying efficient algorithms for graph matching under probabilistic models. In this work, we propose a new algorithm for graph matching: Our algorithm associates each vertex with a signature vector using a multistage procedure and then matches a pair of vertices from the two graphs if their signature vectors are close to each other. We show that, for two Erdős–Rényi graphs with edge correlation $$1-\alpha$$, our algorithm recovers the underlying matching exactly with high probability when $$\alpha \le 1 / (\log \log n)^C$$, where $$n$$ is the number of vertices in each graph and $$C$$ denotes a positive universal constant. This improves the condition $$\alpha \le 1 / (\log n)^C$$ achieved in previous work.  more » « less
Award ID(s):
2053333
PAR ID:
10317727
Author(s) / Creator(s):
; ;
Editor(s):
Belkin, Mikhail; Kpotufe, Samory
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Volume:
134
ISSN:
2640-3498
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Belkin, Mikhail; Samory Kpotufe (Ed.)
    Graph matching, also known as network alignment, refers to finding a bijection between the vertex sets of two given graphs so as to maximally align their edges. This fundamental computational problem arises frequently in multiple fields such as computer vision and biology. Recently, there has been a plethora of work studying efficient algorithms for graph matching under probabilistic models. In this work, we propose a new algorithm for graph matching: Our algorithm associates each vertex with a signature vector using a multistage procedure and then matches a pair of vertices from the two graphs if their signature vectors are close to each other. We show that, for two Erdős–Rényi graphs with edge correlation 1−α, our algorithm recovers the underlying matching exactly with high probability when α≤1/(loglogn)C, where n is the number of vertices in each graph and C denotes a positive universal constant. This improves the condition α≤1/(logn)C achieved in previous work. 
    more » « less
  2. An efficient implicit representation of an n-vertex graph G in a family F of graphs assigns to each vertex of G a binary code of length O(log n) so that the adjacency between every pair of vertices can be determined only as a function of their codes. This function can depend on the family but not on the individual graph. Every family of graphs admitting such a representation contains at most 2^O(n log(n)) graphs on n vertices, and thus has at most factorial speed of growth. The Implicit Graph Conjecture states that, conversely, every hereditary graph family with at most factorial speed of growth admits an efficient implicit representation. We refute this conjecture by establishing the existence of hereditary graph families with factorial speed of growth that require codes of length n^Ω(1). 
    more » « less
  3. null (Ed.)
    This paper studies seeded graph matching for power-law graphs. Assume that two edge-correlated graphs are independently edge-sampled from a common parent graph with a power-law degree distribution. A set of correctly matched vertex-pairs is chosen at random and revealed as initial seeds. Our goal is to use the seeds to recover the remaining latent vertex correspondence between the two graphs. Departing from the existing approaches that focus on the use of high-degree seeds in $$1$$-hop neighborhoods, we develop an efficient algorithm that exploits the low-degree seeds in suitably-defined D-hop neighborhoods. Specifically, we first match a set of vertex-pairs with appropriate degrees (which we refer to as the first slice) based on the number of low-degree seeds in their D-hop neighborhoods. This approach significantly reduces the number of initial seeds needed to trigger a cascading process to match the rest of graphs. Under the Chung-Lu random graph model with n vertices, max degree Θ(√n), and the power-law exponent 2<β<3, we show that as soon as D> 4-β/3-β, by optimally choosing the first slice, with high probability our algorithm can correctly match a constant fraction of the true pairs without any error, provided with only Ω((log n)4-β) initial seeds. Our result achieves an exponential reduction in the seed size requirement, as the best previously known result requires n1/2+ε seeds (for any small constant ε>0). Performance evaluation with synthetic and real data further corroborates the improved performance of our algorithm. 
    more » « less
  4. Woodruff, David P. (Ed.)
    We give improved algorithms for maintaining edge-orientations of a fully-dynamic graph, such that the maximum out-degree is bounded. On one hand, we show how to orient the edges such that maximum out-degree is proportional to the arboricity $$\alpha$$ of the graph, in, either, an amortised update time of $$O(\log^2 n \log \alpha)$$, or a worst-case update time of $$O(\log^3 n \log \alpha)$$. On the other hand, motivated by applications including dynamic maximal matching, we obtain a different trade-off. Namely, the improved update time of either $$O(\log n \log \alpha)$$, amortised, or $$O(\log ^2 n \log \alpha)$$, worst-case, for the problem of maintaining an edge-orientation with at most $$O(\alpha + \log n)$$ out-edges per vertex. Finally, all of our algorithms naturally limit the recourse to be polylogarithmic in $$n$$ and $$\alpha$$. Our algorithms adapt to the current arboricity of the graph, and yield improvements over previous work: Firstly, we obtain deterministic algorithms for maintaining a $$(1+\varepsilon)$$ approximation of the maximum subgraph density, $$\rho$$, of the dynamic graph. Our algorithms have update times of $$O(\varepsilon^{-6}\log^3 n \log \rho)$$ worst-case, and $$O(\varepsilon^{-4}\log^2 n \log \rho)$$ amortised, respectively. We may output a subgraph $$H$$ of the input graph where its density is a $$(1+\varepsilon)$$ approximation of the maximum subgraph density in time linear in the size of the subgraph. These algorithms have improved update time compared to the $$O(\varepsilon^{-6}\log ^4 n)$$ algorithm by Sawlani and Wang from STOC 2020. Secondly, we obtain an $$O(\varepsilon^{-6}\log^3 n \log \alpha)$$ worst-case update time algorithm for maintaining a $$(1~+~\varepsilon)\textnormal{OPT} + 2$$ approximation of the optimal out-orientation of a graph with adaptive arboricity $$\alpha$$, improving the $$O(\varepsilon^{-6}\alpha^2 \log^3 n)$$ algorithm by Christiansen and Rotenberg from ICALP 2022. This yields the first worst-case polylogarithmic dynamic algorithm for decomposing into $$O(\alpha)$$ forests. Thirdly, we obtain arboricity-adaptive fully-dynamic deterministic algorithms for a variety of problems including maximal matching, $$\Delta+1$$ colouring, and matrix vector multiplication. All update times are worst-case $$O(\alpha+\log^2n \log \alpha)$$, where $$\alpha$$ is the current arboricity of the graph. For the maximal matching problem, the state-of-the-art deterministic algorithms by Kopelowitz, Krauthgamer, Porat, and Solomon from ICALP 2014 runs in time $$O(\alpha^2 + \log^2 n)$$, and by Neiman and Solomon from STOC 2013 runs in time $$O(\sqrt{m})$$. We give improved running times whenever the arboricity $$\alpha \in \omega( \log n\sqrt{\log\log n})$$. 
    more » « less
  5. null (Ed.)
    Given a weighted planar bipartite graph G ( A ∪ B , E ) where each edge has an integer edge cost, we give an Õ( n 4/3 log nC ) time algorithm to compute minimum-cost perfect matching; here C is the maximum edge cost in the graph. The previous best-known planarity exploiting algorithm has a running time of O ( n 3/2 log n ) and is achieved by using planar separators (Lipton and Tarjan ’80). Our algorithm is based on the bit-scaling paradigm (Gabow and Tarjan ’89). For each scale, our algorithm first executes O ( n 1/3 ) iterations of Gabow and Tarjan’s algorithm in O ( n 4/3 ) time leaving only O ( n 2/3 ) vertices unmatched. Next, it constructs a compressed residual graph H with O ( n 2/3 ) vertices and O ( n ) edges. This is achieved by using an r -division of the planar graph G with r = n 2/3 . For each partition of the r -division, there is an edge between two vertices of H if and only if they are connected by a directed path inside the partition. Using existing efficient shortest-path data structures, the remaining O ( n 2/3 ) vertices are matched by iteratively computing a minimum-cost augmenting path, each taking Õ( n 2/3 ) time. Augmentation changes the residual graph, so the algorithm updates the compressed representation for each partition affected by the change in Õ( n 2/3 ) time. We bound the total number of affected partitions over all the augmenting paths by O ( n 2/3 log n ). Therefore, the total time taken by the algorithm is Õ( n 4/3 ). 
    more » « less