<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Dynamic (1+ϵ)-Approximate Matching Size in Truly Sublinear Update Time</title></titleStmt>
			<publicationStmt>
				<publisher>IEEE</publisher>
				<date>11/06/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10536488</idno>
					<idno type="doi">10.1109/FOCS57990.2023.00095</idno>
					
					<author>Sayan Bhattacharya</author><author>Peter Kiss</author><author>Thatchaphol Saranurak</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[We show a fully dynamic algorithm for maintaining (1 + ǫ)-approximate size of maximum matching of the graph with n vertices and m edges using m 0.5-Ωǫ(1) update time. This is the first polynomial improvement over the long-standing O(n) update time, which can be trivially obtained by periodic recomputation. Thus, we resolve the value version of a major open question of the dynamic graph algorithms literature (see, e.g., [Gupta and Peng FOCS'13], [Bernstein  and Stein SODA'16], [Behnezhad and Khanna SODA'22]).Our key technical component is the first sublinear algorithm for (1, ǫn)-approximate maximum matching with sublinear running time on dense graphs. All previous algorithms suffered a multiplicative approximation factor of at least 1.499 or assumed that the graph has a very small maximum degree.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>We study the dynamic version of the maximum matching problem, a cornerstone of combinatorial optimization <ref type="bibr">[Kuh55,</ref><ref type="bibr">Edm65b,</ref><ref type="bibr">Edm65a]</ref>. In the dynamic matching problem, the task is to build a data structure that, given a graph G with n vertices and m edges undergoing both edge insertions and deletions, maintains an (approximate) maximum matching of G or, in the value version, just the size of the maximum matching, denoted by &#181;(G). The goal is to minimize the update time required to update the solution after each edge change.</p><p>The first non-trivial algorithm for this problem was by Sankowski <ref type="bibr">[San07]</ref> 15 years ago, which exactly maintains the maximum matching size using O(n 1.495 ) update time, which is recently improved to O(n 1.407 ) <ref type="bibr">[BNS19]</ref>. Unfortunately, this latter bound is tight under the hinted OMv conjecture <ref type="bibr">[BNS19]</ref>. Furthermore, in sparse graphs, even m 1-o(1) update time is required assuming the kcycle conjecture <ref type="bibr">[PGVWW20]</ref>. These strong conditional lower bounds have shifted the attention of researchers to approximate matching. An &#945;-approximate matching is a matching of size at least &#181;(G)/&#945;. The following has become one of the holy-grail questions in the dynamic graph algorithms and fine-grained complexity communities [ARW17]: Question 1.1. Is there a dynamic (1 + &#491;)-approximate matching algorithm with polylogarithmic update time for an arbitrarily small constant &#491;?</p><p>The current state of the art is, however, still very far from this goal. A straightforward algorithm with O(n) amortized update time is to simply recompute a (1 + &#491;)-approximate matching from scratch in O(m) time <ref type="bibr">[DP14]</ref> every after 2&#491;m/n edge updates. 1 Surprisingly, this easy O(n) bound already captures the limitation of all known techniques! An improved algorithm with O( &#8730; m) update time was given ten years ago by Gupta and Peng <ref type="bibr">[GP13]</ref>, but it still takes O(n) time in dense graphs. Very recently, Assadi et al. <ref type="bibr">[ABKL23]</ref> showed how to obtain O(n/(log * n) &#8486;(1) ) update time, but their regularity-lemma-based approach inherently cannot give an improvement larger than a 2 &#920;( &#8730; log n) = n o(1) factor. Until now, no dynamic (1+&#491;)-approximate algorithms can break through the naive O(n) barrier by a polynomial factor. 2  Intensive research during the last decade instead showed how to speed up update time by relaxing the approximation factor. The influential work by Onak and Rubinfeld <ref type="bibr">[OR10]</ref> gave the first dynamic matching algorithm with polylogarithmic update time that maintains a large constant approximate maximum matching. Then, Baswana, Gupta and Sen <ref type="bibr">[BGS11]</ref> showed a dynamic maximal matching with logarithmic update time, which gives 2-approximation. A large body of work then refined this result in various directions, including constant update time <ref type="bibr">[Sol16]</ref>, deamortization <ref type="bibr">[CS18,</ref><ref type="bibr">BFH19,</ref><ref type="bibr">Kis22]</ref>, and derandomization [BHN16, ACC + 18, BK19, Waj20, BK21]. In 2015, Bernstein and Stein <ref type="bibr">[BS15,</ref><ref type="bibr">BS16]</ref> showed a novel approach for maintaining a (3/2 + &#491;)-approximate matching using &#213;(m 1/4 ) = &#213;( &#8730; n) update time. 3 Refinement of this approach and new trade-off results with approximation in the range (3/2, 2) were also intensively studied [BLM20, GSSU22, Kis22, BK22, RSW22]. All these techniques, however, seem to get stuck at (3/2)-approximation.</p><p>Very recently, the above long-standing trade-off was improved by Behnezhad <ref type="bibr">[Beh23]</ref> and, independently, by Bhattacharya et al. <ref type="bibr">[BKSW23]</ref> via a new connection to sublinear and streaming algorithms. To maintain maximum matching size, they gave 1.973-approximation algorithms with polylogarithmic update time, and, on bipartite graphs, Behnezhad <ref type="bibr">[Beh23]</ref> pushed it further to (3/2 -&#8486;(1))-approximation in &#213;( &#8730; n) update time. While this new connection is very inspiring,</p><p>1 See Appendix C for the proof of this simple algorithm. 2 In contrast, in the partially dynamic setting where graphs undergo edge insertions only or edge deletions only, there are many algorithms with polylogarithmic amortized update time [GRS14, GLS + 19, BGS20, JJST22, BKSW23]. 3 We use &#213;(&#8226;) to hide polylog(n) factor throughout the paper.</p><p>it has been a key open problem <ref type="bibr">[BRRS23]</ref> whether non-trivial (1 + &#491;)-approximate matching algorithms in dense graphs exist in the sublinear model. Hence, it remains unclear whether an improved dynamic (1 + &#491;)-approximation algorithm is possible via this new connection or even possible at all. Indeed, in this paper, we give the first dynamic (1 + &#491;)-approximate matching size algorithm that finally improves the O(n) bound by a polynomial factor, formally stated below.</p><p>Theorem 1.2. There is a dynamic (1 + &#491;)-approximate matching size algorithm with m 0.5-&#8486;&#491;(1) worst-case update time.</p><p>The algorithm is randomized and works against an adaptive adversary with high probability. Moreover, the algorithm maintains (1 + &#491;)-approximate matching M of G in the sense that, given a vertex v, it can return a matched edge (v, v &#8242; ) &#8712; M or &#8869; if v / &#8712; V (M ) in m 0.5+f (&#491;) time, where f is an increasing function such that f (&#491;) &#8594; 0 when &#491; &#8594; 0.</p><p>It has been asked repeatedly <ref type="bibr">[GP13,</ref><ref type="bibr">BS15,</ref><ref type="bibr">BS16]</ref> whether there exists a dynamic (1 + &#491;)approximate matching algorithm with m 0.5-&#8486;&#491;(1) update time. Theorem 1.2 thus gives an affirmative answer to the value version of this open question. Although the matching is not explicitly maintained in Theorem 1.2, it still supports queries whether a vertex is matched or not. The recent algorithms that only maintain the estimate of &#181;(G) by <ref type="bibr">[Beh23,</ref><ref type="bibr">BKSW23]</ref> inherently cannot support this query.</p><p>We obtain Theorem 1.2 by making progress in sublinear algorithms: we show the first sublinear (1, &#491;n)-approximate matching algorithm with truly sublinear time even in dense graphs. Here, an (&#945;, &#946;)-approximate matching means a matching of size at least &#181;(G)/&#945;-&#946;. Given our new sublinear matching algorithm summarized below, Theorem 1.2 follows using known techniques.</p><p>Theorem 1.3. There is a randomized algorithm that, given the adjacency matrix of a graph G, in time n 2-&#8486;&#491;(1) computes with high probability a (1, &#491;n)-approximation &#956; of &#181;(G).</p><p>After that, given a vertex v, the algorithm returns in n 1+f (&#491;) time an edge</p><p>where M is a fixed (1, &#491;n)-approximate matching, where f is an increasing function such that f (&#491;) &#8594; 0 when &#491; &#8594; 0.</p><p>We note that the additive approximation factor in Theorem 1.3 is unavoidable for sublinear algorithms with access to only the adjacency matrix: checking whether there is zero or one edge requires &#8486;(n 2 ) adjacency matrix queries.</p><p>Behnezhad et al. <ref type="bibr">[BRRS23]</ref> posted an open question about sublinear matching algorithms as follows "ruling out say a 1.01-approximation in n 2-&#8486;(1) time would also be extremely interesting."<ref type="foot">foot_0</ref> . Since the additive approximation factor is unavoidable for algorithms using the adjacency matrix only, the analogous question becomes whether one can rule out a (1, n/100)-approximation in n 2-&#8486;(1) time. Theorem 1.3 answers this question negatively since we can get arbitrarily good additive approximation in n 2-&#8486;(1) time.</p><p>To put Theorem 1.3 into the larger context of sublinear matching literature, let us discuss its history below. We use &#8710; and d to denote the maximum and average degree of the graph respectively.</p><p>Approximating &#181;(G). One of the main goals in this area, initiated by Parnas and Ron <ref type="bibr">[PR07]</ref>, is to approximate the size of maximum matching &#181;(G) in sublinear time when given access to the adjacency list and matrix of an input graph. Early research on this topic focused on obtaining O(1) time algorithms when &#8710; = O(1). However, these early work [PR07, NO08, YYI12] may require &#8486;(n 2 ) time on general graphs. This drawback was first addressed in <ref type="bibr">[KMNFT20]</ref> and <ref type="bibr">[CKK20]</ref> (based on <ref type="bibr">[ORRR12]</ref>), both of which were then subsumed by the algorithms of Behnezhad <ref type="bibr">[Beh22]</ref> that compute a (2, o(n))-approximation in &#213;(d+1) time. His algorithms are near-optimal and settle the problem in the regime of approximation ratio at least 2.</p><p>Subsequent work focuses on optimizing the approximation ratio within n 2-&#8486;(1) time. To compare with Theorem 1.3, let us discuss only algorithms that use the adjacency matrix. Behnezhad et al. <ref type="bibr">[BRRS23]</ref> first broke the 2-approximation barrier by computing a (2 -&#8486; &#947; (1), o(n))-approximate matching in &#213;(n 1+&#947; ) time. Then (3/2, &#491;n)-approximation algorithms with n 2-&#920;(&#491; 2 ) time were shown independently in <ref type="bibr">[BKS23,</ref><ref type="bibr">BRR23]</ref>. <ref type="bibr">Behnezhad et al. [BRR23]</ref> improved this further to (3/2 -&#8486;(1), o(n))-approximation in n 2-&#8486;(1) time on bipartite graphs. <ref type="foot">5</ref> We summarize the previous work in Table <ref type="table">1</ref>.</p><p>By the first part of Theorem 1.3, we show that even (1, &#491;n)-approximation is possible in n 2-&#8486;&#491;(1) time. As we mentioned, this result addresses the open question of <ref type="bibr">[RSW22]</ref>. It remains very interesting to see an optimal approximation-time trade-off for this problem.</p><p>Matching Oracles. In the area of local computation algorithms (LCA), initiated by Robinfeld et al. <ref type="bibr">[RTVX11,</ref><ref type="bibr">ARVX12]</ref>, we want a matching oracle for some fixed approximate matching M such that, given any vertex v, return</p><p>The goal is to optimize the approximation ratio of M and minimize the worst-case query time over all vertices. Note that, given a matching oracle for an &#945;-approximate matching, we can compute (&#945;, &#491;n)-approximation of &#181;(G) by simply querying the oracle at O(1/&#491; 2 ) random vertices. So this is stronger than the previous goal.</p><p>The worst-case guarantee over all vertices is stronger than the expected query time for each vertex <ref type="bibr">[NO08]</ref> or for just a random vertex <ref type="bibr">[YYI12,</ref><ref type="bibr">Beh22]</ref>, which is even weaker. This strong guarantee is useful for bounding the query time of adaptive queries, which depend on answers of the previous queries, and is crucial in some applications <ref type="bibr">[LRV22]</ref>. Our approach for "boosting" the approximation ratio also requires adaptive queries and hence needs worst-case guarantees.</p><p>A long line of work [RTVX11, ARVX12, RV16, LRY15, Gha16, GU19, Gha22] focused on building an oracle for maximal independent sets (which implies a 2-approximate matching oracle) and culminated in an oracle by <ref type="bibr">Ghaffari [Gha22]</ref> that uses poly(&#8710; log n) query time with high probability. <ref type="bibr">Levi et al.</ref> [LRY15] also a showed (1 + &#491;)-approximate matching oracle with &#8710; O(1/&#491; 2 ) polylog(n) query complexity. However, all these algorithms are not sublinear in dense graphs. In this regime, the only non-trivial matching oracle was by Kapralov et al. <ref type="bibr">[KMNFT20]</ref> and has &#213;(&#8710;) query time, but the approximation ratio is only a large constant and is in expectation. We summarize the previous work in Table <ref type="table">2</ref>.</p><p>The second part of Theorem 1.3 gives the first non-trivial matching oracle on dense graphs whose multiplicative approximation ratio is a small constant, which is 1 in our case, but we need to pay additive approximation factor. Summary. Our main result, Theorem 1.2, is the first dynamic (1 + &#491;)-approximate matching size algorithm with m 0.5-&#8486;&#491;(1) update time, breaking through the naive yet long-standing O(n) barrier by a polynomial factor. Our key technical component, Theorem 1.3, makes progress in the area of sublinear-time matching algorithms on dense graphs. Among algorithms for approximating &#181;(G) only, we improve the best approximation ratio from (3/2-&#8486;(1), o(n)) by <ref type="bibr">[BRR23]</ref> to (1, &#491;n). Among LCAs, it is the first one on dense graphs whose multiplicative approximation is a small constant.</p><p>Organization. First, we give an overview of our algorithms in Section 2. Then, we set up notations and give preliminaries in Section 3. In Section 4, we present a key building block which is a matching oracle for an induced graph G[A] where A is unknown to us. Using this, we show in Section 5 how to boost the approximation ratio of any matching oracle. By repeatedly boosting the approximation ratio, we give a (1, &#491;n)-approximate matching oracle (Theorem 1.3) in Section 6. Finally, we combine this oracle with known techniques in dynamic algorithms to Theorem 1.2 in Section 7.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>We would like to thank David Wajc for suggesting the implementation of McGregor's algorithm in the sub-linear model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Technical Overview</head><p>Our high-level approach is based on the interconnection between dynamic, sublinear, and streaming algorithms. This connection differs from the ones used in the recent results of <ref type="bibr">[BKSW23,</ref><ref type="bibr">Beh23]</ref>. For example, the dynamic (2 -&#8486;(1))-approximate algorithms in <ref type="bibr">[BKSW23,</ref><ref type="bibr">Beh23]</ref> are inspired by the two-pass streaming algorithms (e.g. <ref type="bibr">[KMM12]</ref>). Then, they use sublinear algorithms <ref type="bibr">[Beh22]</ref> to implement this streaming algorithm in the dynamic setting efficiently. <ref type="foot">6</ref> In contrast, it is our sublinear algorithm, not dynamic algorithm, that is inspired by the O(1)-pass streaming algorithm <ref type="bibr">[McG05]</ref>. Below, we explain the overview of our sublinear algorithm, which consists of two key ingredients, and then explain how our dynamic algorithm easily follows.</p><p>Ingredient I: Reduction from (1, &#947;n)-Approximation to Arbitrarily Bad Approximation.</p><p>An initial observation is that the streaming algorithm by McGregor <ref type="bibr">[McG05]</ref> can be viewed as the following reduction: one can compute a (1 + &#947;)-approximate matching by making O &#947; (1) calls to a subroutine that, given S &#8838; V , returns a O(1)-approximate matching of the induced subgraph G[S].</p><p>We observe that a much weaker subroutine suffices when additive approximation is allowed. Let LargeMatching(S, &#948;) be a subroutine that, given S &#8838; V and &#948;, returns a matching</p><p>Note that the approximation of M can be arbitrarily bad depending of &#948;. By adapting McGregor's algorithm, we show how to compute a (1, &#947;n)-approximate matching using only t = O &#947; (1) calls to LargeMatching(S 1 , &#948; 1 ), . . . , LargeMatching(S t , &#948; t )</p><p>where each &#948; i is a small constant depending on &#947;. This algorithm, denoted by Alg(&#947;), is our template algorithm (detailed in Section 5.1), which we will try to implement in the sublinear setting.</p><p>Additionally, we observe that each vertex set S i can be determined in a very local manner. More precisely, a membership-query of the form "is a vertex v &#8712; S i ?" can be answered by making only q = O &#947; (1) matching-queries of the form "is a vertex u &#8712; V (M j )? if so, return (u, u &#8242; ) &#8712; M " where j &lt; i and M j is the output of LargeMatching(S j , &#948; j ) previously computed.</p><p>However, the big challenge in the sublinear model, unlike the streaming model, is that even the weak subroutine like LargeMatching(&#8226;) is impossible. <ref type="foot">7</ref> Even worse, if we could not compute each matching M j explicitly for j &lt; i, then how can we answer a membership-query whether v &#8712; S i ? Note that known sublinear algorithms for estimating the matching size of G[S] are not useful here.</p><p>The above obstacle leads us to our second ingredient. We show that at least the oracle version of LargeMatching(&#8226;) can be implemented in the sublinear model. Later, we will explain why it is strong enough for implementing the template algorithm Alg(&#947;) in the sublinear model.</p><p>Ingredient II: Large Matching Oracles on Induced Subgraphs. Suppose that a vertex set A &#8838; V is unknown to us but a membership-query of A, i.e., checking if v &#8712; A, can be done in n 1+&#491; time. Given access to the adjacency matrix of G, we show how to construct an oracle LargeMatchingOracle(A, &#948;, &#491;) with the following guarantee: Using &#213;&#948; n 2-&#491; preprocessing time, we obtain an oracle that supports matching-queries for a matching M in G[A] with &#213;&#948; n 1+g(&#491;) query time where</p><p>The main challenge of implementing LargeMatchingOracle(A, &#948;, &#491;) in the sublinear model is that we want to find a large matching on the induced subgraph G[A]. The challenge comes from possible &#8486;(n 2 ) edges between A and V \ A, and we must avoid reading these edges to get sublinear time. It turns out that this challenge can be overcome. We use the idea that appeared before in the algorithm of <ref type="bibr">[BRR23]</ref> in a different context of estimating (3/2 -&#8486;(1))-approximation &#181;(G) on bipartite graphs. See the details in Section 4.</p><p>Given the above two ingredients, we can combine them to get our main results in the sublinear and dynamic settings, as follows.</p><p>Result I: (1, &#947;n)-Approximate Matching Oracles in n 2-&#8486;&#947; (1) Time. Now, we show how to implement the template algorithm Alg(&#947;) in n 2-&#8486;&#947; (1) time. Let &#491; &#8712; (0, 1) be a small constant where lim &#947;&#8594;0 &#491; = 0. Let &#491; 0 = &#491; and &#491; i = g(&#491; i-1 ) for all i &#8712; [1, t] where g is the function in the guarantee of Ingredient II. So &#491; = &#491; 0 &#8804; &#491; 1 &#8804; &#8226; &#8226; &#8226; &#8804; &#491; t and lim &#947;&#8594;0 &#491; t = 0.</p><p>We simply replace each call to LargeMatching(S i , &#948; i ) with LargeMatchingOracle(S i , &#948; i , &#491; i-1 ). Now, by induction on i &#8712; [1, t], we will show that we can support membership-queries for S i in &#213;&#947; n 1+&#491; i-1 time and matching-queries for M i in &#213;&#947; n 1+&#491; i time. Let us ignore the base case as it is trivial. For the induction step, we have the following:</p><p>1. To answer a membership-query for S i , the template algorithm only needs to make q = O &#947; (1) matching-queries to M j where j &lt; i. So the total query time is q &#8226; &#213;&#947; (n 1+&#491; i-1 ) = &#213;&#947; (n 1+&#491; i-1 ).</p><p>2. To answer a matching-query for M i , the oracle LargeMatchingOracle(S i , &#948; i , &#491; i-1 ) for the matching M i has query time &#213;&#947; n 1+g(&#491; i-1 ) = &#213;&#947; n 1+&#491; i .</p><p>The total preprocessing time we need for LargeMatchingOracle(&#8226;) to implement all the t rounds is</p><p>. At the end of the last round, we can support matchingqueries for the (1, &#947;n)-approximate matching M returned by Alg(&#947;) in &#213;&#947; (n 1+&#491;t ) time, where lim &#947;&#8594;0 &#491; t = 0.</p><p>To get a (1, &#947;n)-approximate estimate &#956; of &#181;(g), we sample &#213;(1/&#947; 2 ) vertices and check if they are matched under M . Whp, this is a (1, &#920;(&#947;)n)-approximation of &#181;(G) because M is (1, &#947;n)approximate.</p><p>Result II: Dynamic (1 + &#947;)-Approximate Matching Size.</p><p>Our dynamic matching size algorithm now follows from standard techniques. Using the well-known vertex reduction technique (see, for example, Corollary 4.9 of <ref type="bibr">[Kis22]</ref>), we can assume that &#181;(G) &#8805; &#947;n at all times. We work in phases, where each phase lasts for &#947; 2 n updates. At the start of each phase, we invoke the sublinear algorithm from Result I above, to obtain a (1, &#947; 2 n)-approximate estimate &#956; of &#181;(G), in n 2-&#8486;&#947;(1) time. Since &#181;(G) &#8805; &#947;n and since the phase lasts for only &#947; 2 n updates, this &#956; continues to remain a purely multiplicative (1+&#920;(&#947;))-approximate estimate of &#181;(G) throughout the duration of the phase. This leads to an amortized update time of n 2-&#8486;&#947;(1) /(&#947; 2 n) = n 1-&#8486;&#947; (1) . In Section 7, we show how to extend this approach to prove Theorem 1.2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Notations and Preliminaries</head><p>Unless speficied otherwise, the input graph G = (V, E) will have n nodes and m edges. A matching M &#8838; E is a subset of edges that do not share any common endpoint. We use the symbol &#181;(G) to denote the size of a maximum matching in G. We say that a pah p = (v 0 , v 1 , . . . , v i ) is an alternating path in G w.r.t. a matching M &#8838; E iff (v j , v j+1 ) &#8712; E for all j &#8712; [0, i -1] and the edges in the path p alternate between being in M and in E \ M . We say that p is an augmenting path in G w.r.t. M iff p is an alternating path whose first and the last edges are both unmatched in M . The length of a path is the number of edges in it. We let V (M ) denote the set of matched nodes in a matching M &#8838; E. Consider any node v &#8712; V (M ) and suppose that (u, v) &#8712; M . Then we say that u is the mate of v in M . Given a subset of nodes S &#8838; V , G[S] denotes the subgraph of G induced by S. Given any graph G &#8242; , the symbol E(G &#8242; ) denotes the set of edges in G &#8242; .</p><p>Throughout the paper, the symbol &#920; k,&#947; (1) will denote any positive constant that depends only on k and &#947; (where k and &#947; are constant parameters whose values will be chosen later on). We analogously use the notation &#920; k (1) to denote a constant that depends only on k. Finally, the symbol &#213;(.) will be used to hide any polylog(n) factors.</p><p>Oracles. We have the adjacency matrix access to the input graph G. Each query takes O(1) time. We do not have the adjacency list access to the input graph.</p><p>For any vertex set A &#8834; V , an A-membership oracle mem A : V &#8594; {0, 1} indicates whether v &#8712; A for any v &#8712; V . That is, we have mem A (v) = 1{v &#8712; A}.</p><p>A matching oracle match M : V &#8594; V 2 &#8746; {&#8869;} for a matching M is an oracle that, given a vertex v &#8712; V , returns</p><p>Similarly, a mate oracle mate M : V &#8594; V &#8746; {&#8869;} for a matching M is an oracle that, given a vertex v &#8712; V , returns</p><p>Concentration Bounds. We need standard concentration bounds as follows.</p><p>Proposition 3.1 (Hoeffding bound). Let X 1 , . . . , X n be independent random variables such that a &#8804; X i &#8804; b. Let X = n i=1 X i . For any t &gt; 0,</p><p>Proposition 3.2 (Chernoff bound). Let X 1 , . . . , X n be independent {0, 1}-random variables. Let X = n i=1 X i where E[X] &#8804; &#181; For any t &gt; 0 where t &#8804; &#181;,</p><p>Chernoff bound can be much stronger than Hoeffding bound when E[X] has small upper bound. For example, if we applied Proposition 3.1 to the setting for Proposition 3.2, we would only get that the bound of 2 exp(-2t 2 3n ) which is much weaker than 2 exp(-t 2 3&#181; ) when &#181; &#8810; n.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Matching Oracles of Induced Subgraphs</head><p>In this section, we present the key subroutine of this paper. The goal is to construct a matching oracle for an induced subgraph G[A] but A is unknown to us; we only have access to an A-membership oracle mem A .</p><p>Theorem 4.1. Let G = (V, E) be a graph, A &#8838; V be a vertex set. Suppose that we have access to adjacency matrix of G and an A-membership oracle mem A with t A query time. We are given as input &#491; &gt; 0 and &#948; in &gt; 0.</p><p>We can preprocess G in &#213;((t A + n)(n 1-&#491; + n 4&#491; )/poly(&#948; in )) time and either return &#8869; or construct a matching oracle match M (&#8226;) for a matching M &#8834; G[A] of size at least &#948; out n where &#948; out = &#948; 5 in /10 8 that has &#213;((t A + n)n 4&#491; /poly(&#948; in )) worst-case query time. If &#181;(G[A]) &#8805; &#948; in n, then &#8869; is not returned. The guarantee holds with high probability.</p><p>The very important property of Theorem 4.1 is that it makes n 1-&#491; oracle calls to mem A during preprocessing and only n O(&#491;) calls to mem A on each query. The rest of this section is devoted for proving Theorem 4.1.</p><p>To prove Theorem 4.1, we adapt the technique used inside the algorithm by Behnazhad et al. <ref type="bibr">[BRR23]</ref> for (3/2 -&#8486;(1))-approximating &#181;(G) on bipartite graphs. We observe that the idea there has reach beyond (3/2 -&#8486;(1))-approximation algorithms. The abstraction of that idea leads us to Theorem 4.1, the crucial subroutine for later parts of our paper.</p><p>This section is organized as follows. In Section 4.1, we show a weaker version of Theorem 4.1 that works well on low degree graphs. We will use this weaker version in the preprocessing step, described in Section 4.2. Then, we complete the query algorithm in Section 4.3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Oracles on Low Degree Graphs</head><p>Here, we show a similar result as Theorem 4.1, but it is efficient only when the maximum degree &#8710; is small. In particular, the query algorithm makes n O(&#491;) calls to mem A only when &#8710; = n O(&#491;) . Lemma 4.2. Let G = (V, E) be a graph with maximum degree &#8710; where &#8710; is known and A &#8838; V be a vertex set. Suppose that we have access to adjacency matrix of G and an A-membership oracle mem A with t A query time. We can construct in &#213;((t</p><p>The proof of Lemma 4.2 is based on the the following (2, &#491;n)-approximate matching oracle given access to adjacency list. Lemma 4.3 is proved by combining an improved analysis of randomized greedy maximal matching of Behnezhad <ref type="bibr">[Beh22]</ref> into a framework for constructing an LCA by <ref type="bibr">[LRY15]</ref>. We do not claim any novel contribution here and defer the proof to Appendix A. Now, to prove Lemma 4.2, we need to strengthen Lemma 4.3 in two ways. First, it must work with the adjacency matrix, not the adjacency lists. Second, it must return a large matching of an induced subgraph G[A], not that of G. However, this can be done using a simple simulation.</p><p>Proof of Lemma 4.2. Let A denote the algorithm of Lemma 4.3. We simulate A on G[A] with parameter d &#8592; &#8710; as follows.</p><p>Whenever A needs to sample a vertex, we sample O(log(n)/&#491;) vertices in G and call mem A on each of them. If one of them is in A, then we get a random vertex in G[A]. If none of them is in A, then w.h.p. |A| &#8804; &#491;n. If this ever happens, even an empty matching is a (2, &#491;n)-approximate matching in G[A], and the problem becomes trivial.</p><p>Whenever A needs to make queries to the adjacency list of any vertex v, we can construct the whole adjacency list of v in G[A] by first making n adjacency matrix queries to learn all neighbors of v in G and then makes deg(v) &#8804; &#8710; oracles calls to mem A to know which neighbors are in G[A]. This takes O(t A &#8710; + n) time. Every other computation can be simulated without the overhead.</p><p>Therefore, each step of A can be simulated with an extra</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Preprocessing</head><p>We describe the preprocessing algorithm in Algorithm 1 with the guarantees summarized in the lemma below.</p><p>that satisfies one of the following:</p><p>The algorithm also reports which properties above M &#8242; satisfies. If &#181;(G[A]) &#8805; &#948; in n, then &#8869; is not returned with high probability.</p><p>In Algorithm 1, the remaining set V &#8242; is initialized as V and only shrinks. For convenience, we let A &#8242; := A &#8745; V &#8242; and D &#8242; := D &#8745; V &#8242; denote the remaining alive and dead vertices.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.1">Correctness</head><p>In this part, we prove the correctness of Algorithm 1 assuming that it does not return &#8869;. We first show that &#956;1 and &#956;2 are good approximation of M i [A] and M i base on basic there definition and Hoeffding's bound.</p><p>Lemma 4.5. For every i, we have</p><p>and by Hoeffding bound Proposition 3.1, we have</p><p>Repeat the following for T times:</p><p>1. Sample kp distinct pairs of vertices from V &#8242; . Partition the sampled pairs into (P 1 , . . . , P k )</p><p>where each P i is an ordered list containing p pairs of vertices.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">For</head><p>(b) Let M i be the greedy maximal matching when scanning E i in order. \\Case 1:</p><p>\\Case 2:</p><p>Return &#8869; (Error).</p><p>Lemma 4.6. For every i, we have</p><p>Proof. The probability that a random vertex from</p><p>2r 2 -&#948;outn 2 and by Hoeffding bound Proposition 3.1, we have</p><p>Next, we show the "sparsification" property of randomized greedy maximal matching</p><p>should not exist because it should have been matched by M i via one of the sampled edges. The proof is similar to Lemma 3.1 of <ref type="bibr">[BFS12]</ref>, which considers this sparsification property of the randomized greedy maximal independent set, instead of maximal matching.</p><p>Lemma 4.7. For every i, the maximum degree of</p><p>Proof. Let us describe an equivalent way to construct M i . Initialize M i = &#8709; and then sample p pairs of vertices in V &#8242; . For each sampled pair</p><p>and both u and v are not matched by M i , then we add (u, v) into M i . At the end of this process, we will show that, for any</p><p>, let M i t denote the matching M i after we sampled the t-th pair. For convenience, we denote</p><p>as the degree of v at time t. We want to show that Pr[deg</p><p>, the probability that v remained at unmatched after time t is</p><p>In particular, the probability that</p><p>From the above lemmas, we can conclude the correctness of the algorithm.</p><p>Corollary 4.8. If Algorithm 1 returns a matching M &#8242; , then M &#8242; satisfies the guarantees from Lemma 4.4 w.h.p.</p><p>Proof. If M &#8242; is returned under Case 1. Then, by Lemma 4.5, we have</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2">Termination without Error</head><p>In this part, we show that if &#181;(G[A]) &#8805; &#948; in n, then Algorithm 1 does not return &#8869; w.h.p. For any graph G and</p><p>Our high-level plan is that we will show that D &#8242; decreases its size by &#920;(&#948; 2 in n) in each iteration in the repeat loop. So D &#8242; must become very small after T = &#920;(1/&#948; 2 in ) iterations. But when this happens, we can show that either Case 1 or Case 2 must happen and so the algorithm must terminate without error.</p><p>To carry out the above plan, we need a helper lemma (Lemma 4.9) which states that, even if V &#8242; keeps shrinking, the maximum matching in G[V &#8242; ] remains large, &#181;(G[A &#8242; ]) &#8805; &#948; in n/2. We will need this fact throughout the whole argument.</p><p>The high-level argument goes as follows. If the algorithm does not return M &#8242; , then M i [A &#8242; ] is very small for all i and so G[A &#8242; ] contains few edges. It follows that the set C of removed vertices contains only few vertices in</p><p>. Thus, we remove only few vertices from A &#8242; and so the size of &#181;(G[A &#8242; ]) cannot decrease too much. The formal argument below goes through the set A &#8242; sp and the above paragraph gives the right intuition.</p><p>, if the algorithm does not terminate yet.</p><p>Proof. We prove by induction on &#964; . For &#964; = 0 (i.e. the beginning the algorithm), the claim holds as &#181;(G[A]) &#8805; &#948; in n. Next, we consider &#964; &#8805; 1. By induction hypothesis, at the beginning of the &#964; -th iteration, we have </p><p>) is a sum of r 3 independent random variable whose range is k (since the maximum degree in G and G[A &#8242; ] is k), by Hoeffding bound Proposition 3.1, we have</p><p>. This means that we remove at most &#948; in n 2T vertices from A &#8242; at the end of the &#964; -th iteration. So the size of maximum matching in G[A &#8242; ] may decrease by at most &#948; in n 2T . Thus,</p><p>Given Lemma 4.9, we will use the following lemma to argue that if Algorithm 1 does not return M i , then then M i must match many vertices between A &#8242; and D &#8242; .</p><p>Proof. Let M * be the maximum matching in G[A &#8242; ] of size at least &#948; in n/2. We partition edges in M * into two parts:</p><p>Lemma 4.10 also says that once D &#8242; become small enough, Algorithm 1 will not err w.h.p.</p><p>approximate matching. So &#956;2 &#8805; 6&#948; out w.h.p. by Lemma 4.6. In either cases, so Algorithm 1 must return a matching M &#8242; at Line 2f or Line 2k. Now, we are ready to show that D &#8242; must shrink significantly after each iteration of the repeat loop, which means that there cannot be too many iterations before the algorithm terminate by Corollary 4.11. Lemma 4.12.</p><p>There are two main claims in the proof of Lemma 4.12. We suggest the reader to skip the proofs of these claims and see how they are used to prove Lemma 4.12 first.</p><p>Proof. At the end of the &#964; -th iteration, since Algorithm 1 did not terminate at Step 2f nor Step 2k, we have, w.h.p., that M [A &#8242; ] &lt; 3&#948; out n by Lemma 4.5 and</p><p>and we have that</p><p>| is a sum of r 3 independent random variable whose range is k (since the maximum degree in G and G[A &#8242; ] is k), by Hoeffding bound Proposition 3.1, we have</p><p>Proof. We have</p><p>applying by Chernoff bound Proposition 3.2 where t = 1000 log n and &#181; = 1000 log n<ref type="foot">foot_4</ref> , we have</p><p>Now, let us prove Lemma 4.12 using the above claims.</p><p>Proof of Lemma 4.12. Observe that</p><p>where the inequality holds w.h.p. by Claim 4.14. Since |D &#8242; \ C| &#8804; n and &#951; = &#948; 2 in log(n)/10, we have by Claim 4.13 that</p><p>and so |D &#8242; &#8745; C| &#8805; &#948; 2 in 100 n as desired. Finally, we give the conclusion of this part.</p><p>/2 w.h.p. by Lemma 4.9. So, by Lemma 4.12 D &#8242; decreases its size by &#948; 2 in n/100 in each iteration in the repeat loop. Hence, we have that |D &#8242; | &#8804; &#948; in n/3 before T = 100/&#948; 2 in iterations. Therefore, there is an iteration &#964; &#8712; [1, T ] where Algorithm 1 will return a matching M &#8242; w.h.p. by Corollary 4.11.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.3">Preprocessing Time</head><p>Consider the (2, &#948; out n)-approximate matching oracle match low (&#8226;) in Line 2g, which is given graph</p><p>By Lemma 4.7, we can assume w.h.p. that G[V &#8242; \ V (M i )] has degree at most n 2&#491; . Lemma 4.2 implies the following: Proposition 4.16. Both preprocessing and query time of match low (&#8226;) is at most &#213;((t</p><p>Lemma 4.17. Algorithm 1 takes &#213;((t A + n)(n 1-&#491; + n 4&#491; )/poly(&#948; in )) total running time.</p><p>Proof. We will analyze the total running time for each iteration of the repeat-loop in Algorithm 1.</p><p>Since there are T = O(1/&#948; 2 in ) iterations and we assume &#948; in &#8805; 1/poly log n, the running time is the same up to polylogarithmic factor. Now, fix one iteration of the repeat-loop.</p><p>The total time to compute M i , for all i &#8804; k, is &#920;(kp) = &#213;(n 2-&#491; ). For each for-loop iteration, to compute &#956;1 , we make &#920;(r 1 ) queries to mem A taking &#920;(r 1 ) &#8226; t A = &#213;(t A /&#948; 10 in ) time. To compute &#956;2 , we make r 2 queries to match low (&#8226;). By Proposition 4.16, this takes time</p><p>&#948; 25 in ) by Lemma 4.2. Next, we analyze the time to compute A &#8242; sp . Since |A &#8242; | &#8805; &#948; in n w.h.p. by Lemma 4.9, we can sample a random vertex in A &#8242; by sampling at most O(log n/&#948; in ) times in V &#8242; w.h.p. For each sample, we need to make a query to mem A , so we can compute A</p><p>To conclude, the total running time in each iteration of the repeat-loop at most</p><p>The main lemma on preprocessing, Lemma 4.4, is implied by combining Corollary 4.8, Corollary 4.15 and Lemma 4.17</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Query Algorithm</head><p>We define our matching oracle match depending on the cases from Lemma 4.4.</p><p>Suppose Lemma 4.4 returns</p><p>The algorithm for outputting match(v) with respect to M 1 is described in Algorithm 2. The correctness is straightforward and the worst-case query time is clearly <ref type="foot">9</ref> The algorithm for outputting match(v) with respect to M 2 is described in Algorithm 3. The correctness is straightforward. Let us analyze the query time.</p><p>Step 1 takes t A + O(1) time.</p><p>Step 2 takes &#213;((t A + n)n 4&#491; /&#948; 3 out ) following the same proof as in Proposition 4.16 (the maximum degree of G[V &#8242; \ V (M &#8242; )] is at most n 2&#491; w.h.p. by Lemma 4.7).</p><p>Algorithm 3 Compute match(v) with respect to M 2 . Let match low be the (2, &#948; out n)-approximate matching oracle from Lemma 4.2 when given graph</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Using the oracle match</head><p>In both cases, the matching oracle match respects a matching of size at least &#948; out n and has worst-case query time at most &#213;((t A + n)n 4&#491; /poly(&#948; in )) w.h.p.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Boosting the Approximation Guarantee of a Matching Oracle</head><p>Recall the notations from Section 3. Throughout this section, we use the following parameters.</p><p>) is a sufficiently large integral constant that depends only on k and &#947; (see Lemma 5.12), and &#491; in &gt; 0 is a sufficiently small constant such that 9 T &#8226; &#491; in &lt; 1/5.</p><p>We present an algorithm Augment(G, M in , k, &#947;, &#491; in ), which takes as input: a graph G = (V, E) with n nodes, the parameters k, &#947;, &#491; in as in Definition 5.1, and an oracle match M in (.) for a matching M in in G that has &#213;k,&#947; (n 1+&#491;in ) query time. The algorithm either returns an oracle match M out (.) for a matching M out in G that is obtained by applying a sufficiently large number of length (2k + 1)augmenting paths to M in , or it returns Failure. We now state our main result in this section.</p><p>Theorem 5.2. Set &#491; out := 9 T &#8226; &#491; in (see Definition 5.1). Given adjacency-matrix query access to the input graph G = (V, E), the algorithm Augment(G, M in , k, &#947;, &#491; in ) runs in &#213;k,&#947; n 2-&#491;in time. Further, either it returns an oracle match M out (.) with query time &#213;k,&#947; (n 1+&#491;out ), for some matching</p><p>&#8226; n (we say that it "succeeds" in this case), or it returns Failure. Finally, if the matching M in admits a collection of &#947; &#8226; n many node-disjoint length (2k + 1)-augmenting paths in G, then the algorithm succeeds whp.</p><p>In Section 5.1, we present a template algorithm for the task stated in Theorem 5.2. This is inspired by an algorithm of McGregor <ref type="bibr">[McG05]</ref> for computing a (1 + &#491;)-approximate matching in the semi-streaming model.While describing the template algorithm, we assume that we are given the matching M in explicitly as part of the input, and that we need to either construct the matching M out or return Failure. Note, however, that in the sublinear setting, we cannot assume this.</p><p>Subsequently, in Section 5.2, we show how to implement the template algorithm in the sublinear setting under adjacency-matrix queries, which leads to the proof of Theorem 5.2.</p><p>Remark on Oracles: Throughout this section, we will treat the oracle match M (.) as a data structure in the sublinear model, which returns the appropriate answer upon receiving a query. In contrast, we will treat the oracle mate M (.) as simply an abstract function, so that mate M (v) simply denotes the mate of v (if it exists) under M (see Section 3). Note that we can return the value of mate M (v) by making a single query to match M (v), without any additional overhead in time.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">A Template Algorithm</head><p>We denote the template algorithm simply by Augment-Template(G, M in , k, &#947;), as we do not need the parameter &#491; in to describe it. The parameter &#491; in will become relevant only in Section 5.2, when we consider implementing this algorithm in the sublinear setting.</p><p>As part of the input to the template algorithm, the n-node graph G = (V, E) and the matching M in are specified explicitly. The algorithm either returns an explicit matching M out in G of size</p><p>&#8226; n (we say that it "succeeds" in this case), or it returns Failure. If M in admits a collection of &#947; &#8226; n many node-disjoint length (2k + 1)-augmenting paths in G, then the template algorithm succeeds whp. This mimics Theorem 5.2. Furthermore, the template algorithm has access to a subroutine LargeMatching(S, &#948;), which takes as input a subset of nodes S &#8838; V and a small constant &#948; &#8712; (0, 1), and either returns &#8869; or returns a matching M in G[S] such that |M | &#8805; 1 10 8 &#8226; &#948; 5 &#8226; n. In addition, if &#181;(G) &#8805; &#948; &#8226; n, then it is guaranteed that LargeMatching(G, &#948;) does not return &#8869;. This mimics Theorem 4.1, with &#948; in = &#948;.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.1">Algorithm Description</head><p>Random partitioning: We start by partitioning the node-set V into 2k + 2 subsets L 0 , . . . , L 2k+1 , as follows. For each v &#8712; V , we place the node v into one of the subsets L 0 , . . . , L 2k+1 chosen uniformly and independently at random. We will refer to the subset L i as layer i of this partition. If v &#8712; L i , then we will write &#8467;(v) = i and simply say that the node v belongs to layer i.</p><p>Let p be an augmenting path of length (2k + 1) in G w.r.t. M in . Assign an arbitrary direction to this path, so that we can write p = (v 0 , v 1 , . . . , v 2k+1 ) w.l.o.g. Specifically, we have (v 2i , v 2i+1 ) &#8712; E \ M in for all i &#8712; [0, k], and (v 2i-1 , v 2i ) &#8712; M in for all i &#8712; [1, k]. We say that the path p survives the random partitioning iff v i &#8712; L i for all i &#8712; [0, 2k + 1].</p><p>Lemma 5.3. Consider any collection P of node-disjoint length (2k + 1)-augmenting paths in G w.r.t. M in . Let P * &#8838; P denote the subset of paths in P that survive the random partitioning. If</p><p>Proof. Each path p &#8712; P survives the random partitioning with probability (2k + 2) -(2k+2) . Since |P| &#8805; &#947; &#8226;n, by linearity of expectation, we get:</p><p>1)&#8226;n. Finally, we note that whether a given path p &#8712; P survives the random partitioning or not is independent of the fate of the other paths in P. The lemma now follows from a Chernoff bound.</p><p>Motivated by Lemma 5.3, the template algorithm will only attempt to augment M in along those augmenting paths that survive the random partitioning. This leads us to introduce the notion of layered subgraphs of G, as described below. Intuitively, although the template algorithm does not know the set P * in advance, it can be certain that the sequence of edges in any length (2k + 1)augmenting path in P * appears in successive layered subgraphs (see Observation 5.6).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Layered subgraphs of</head><p>Given the nodes in V H , the edge-set E H &#8838; E is defined as follows. An edge (u, v) &#8712; E belongs to</p><p>We next define the subgraph H := (V H , E H ). Finally, for each i &#8712; [0, 2k], let G i := (V, E i ) be a bipartite subgraph of G, where</p><p>is the set of edges in E H between layer i and layer i + 1. Note that we have defined the subgraphs {G i } over the entire node-set V , although every edge in these subgraphs has both its endpoints in V H . This is done to simplify notations, as will become evident when we describe how to implement our algorithm in the sublinear setting. For each i &#8712; [0, 2k + 1], we refer to the nodes in V i := L i &#8745; V H as being relevant for the concerned layer.</p><p>We now state some key observations, which immediately follow from the description above.</p><p>Observation 5.4. For all i &#8712; [1, 2k], we have V i &#8838; V (M in ).</p><p>Observation 5.5. For all i &#8712; [0, k], we have</p><p>Thus, if i is even, then G i consists of all the edges from G that connect two relevant nodes across the concerned layers. In constrast, if i is odd, then G i consists of the edges from M in that connect two relevant nodes across the concerned layers.</p><p>Observation 5.6. Consider any augmenting path p = (v 0 , v 1 , . . . , v 2k+1 ) w.r.t. M in in G that survives the random partitioning. Then we have</p><p>Nested matchings: Fix any j &#8712; [0, k], and for each i &#8712; [0, j] consider a matching M 2i &#8838; E 2i in G 2i . We say that the sequence of matchings M 0 , M 2 , . . . M 2j is nested iff for all i &#8712; [1, j] and all</p><p>Observation 5.7. Consider any sequence of nested matchings M 0 , M 2 , . . . , M 2k . Then there exists a collection of node-disjoint length (2k + 1)-augmenting paths of size |M 2k | w.r.t. M in in G.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proof. Consider any node</head><p>which is constructed according to the following procedure.</p><p>&#8226; v 2k+1 &#8592; v, and i &#8592; 2k. (Note that v 2k+1 is at layer 2k + 1 and is matched under M 2k .)</p><p>&#8226; While i &#8805; 0:</p><p>-</p><p>i &#8592; i -1.</p><p>Since the sequence M 0 , M 2 , . . . , M 2k is nested, applying Observation 5.4 and Observation 5.5, we can show (by an induction on the number of iterations of the While loop) that p(v) is a length (2k + 1)augmenting path in G w.r.t. M in . Furthermore, it is easy to see that the paths {p(v)} v&#8712;V (M 2k )&#8745;V 2k+1 constructed in this manner are mutually node-disjoint. This implies the observation.</p><p>Important parameters: We fix a constant &#968; &#8712; (0, 1), which depends on k and &#947;, i.e., &#968; = &#920; k,&#947; (1), and is chosen to be sufficiently small (see Corollary 5.16). Next, for each i &#8712; [0, k], we define:</p><p>Consider any matching M &#8242; between the nodes at layers 2i and 2i+1, where i &#8712; [0, k]. Intuitively, the parameters &#968; i and &#948; i will determine how large M &#8242; needs to be so as to make us "happy". Note that the values of &#948; i and &#968; i decrease in a doubly exponential manner with i. This fact will be crucially used during the analysis in Section 5.1.2.</p><p>A relatively informal summary of the algorithm: Motivated by Observation 5.7, the template algorithm attempts to find a sequence of nested matchings ending at layer 2k. Specifically, the algorithm runs in iterations. At the start of a given iteration, we maintain a sequence of nested matchings M 0 , M 2 , . . . , M 2i up to some layer 2i, such that |M 2j | &#8805; &#968; j &#8226; n for all j &#8712; [0, i]. If i = k, then by Observation 5.7 we can already identify a collection of &#968; k &#8226;n = &#920; k,&#947; (1)&#8226;n many node-disjoint length (2k + 1)-augmenting paths in G w.r.t. M in , and so we just apply those augmenting paths to M in and return the resulting matching M out . Henceforth, assume that i &lt; k. We classify each node in V H as either alive or dead (at the start of the first iteration every node was alive). We also enforce the invariant that all the nodes currently matched in M 0 &#8746; M 2 &#8746; &#8226; &#8226; &#8226; &#8746; M 2i are alive. During the current iteration, we attempt to find a large matching M &#8242; between the alive nodes in G 2i+2 , while ensuring that the sequence M 0 , M 2 , . . . , M 2i , M &#8242; remains nested. Specifically, we make a call to the subroutine LargeMatching(S, &#948; i+1 ), for an appropriate S &#8838; V 2i+2 &#8746; V 2i+3 . Depending on the outcome of this call, we now fork into one of the following three cases.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (a):</head><p>The call to LargeMatching(S, &#948; i+1 ) returns a matching M &#8242; . Thus, we are guaranteed that</p><p>We set M 2i+2 := M &#8242; , i := i + 1, and proceed to the next iteration.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (b):</head><p>The call to LargeMatching(S, &#948; i+1 ) returns &#8869;, and i = -1 (i.e., the sequence of matchings M 0 , M 2 , . . . , M 2i was empty). Here, we terminate the template algorithm and return Failure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (c):</head><p>The call to LargeMatching(S, &#948; i+1 ) returns &#8869;, and i &#8805; 0. Here, we change the status of all the nodes in V (M 2i ) &#8745; V 2i+1 , along with their matched neighbors under M in (who are at layer 2i + 2), from alive to dead. We then delete the matching M 2i , set i := i -1, and proceed to the next iteration.</p><p>We will need some more notations while working with this algorithm in Section 5.2. Accordingly, below we present a more detailed and technical description of the template algorithm, along with these additional notations. While going through the rest of this section, the reader will find it helpful to refer back to the informal description above, whenever necessary.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Iterations:</head><p>In each iteration t &#8805; 1, we will compute a matching M (t) in the subgraph G &#963;(t) , where &#963;(t) &#8712; {0, 2, 4, . . . , 2k}. The mapping &#963; : T &#8594; {0, 2, . . . , 2k} will be constructed in an online manner, i.e., we will assign the value &#963;(t) only during the t th iteration. We now describe the state of the algorithm at the end of any given iteration.</p><p>At the end of an iteration t, a subset of past iterations &#923; (t) &#8838; [t] are designated as being active w.r.t. t. If &#923; (t) = &#8709;, then we write &#923; (t) := &#955;</p><p>(t) 0 , &#955; (t) 1 , . . . , &#955; </p><p>corresponds to the sequence M 0 , M 2 , . . . , M 2i in the discussion immediately after Observation 5.7. Thus, the algorithm satisfies the following invariants.</p><p>Invariant 5.8. We have stack(t) &#8804; k, and &#963; &#955; (t) j = 2j for all j &#8712; [0, stack(t)].</p><p>Invariant 5.9. The sequence of matchings M</p><p>For each layer i &#8712; [0, 2k + 1], the set of relevant nodes V i is partitioned into two subsets: A i and D i . We refer to the nodes in A i as alive, and the nodes in D i as dead. We let A := 2k+1 i=0 A i and D := 2k+1 i=0 D i respectively denote the set of all alive and dead nodes, across all the layers. The next invariant states that every matched node in an active iteration is alive.</p><p>At the start of the first iteration (when t = 1), every relevant node is alive (i.e., A i = V i and D i = &#8709; for all i &#8712; [0, 2k + 1]). Subsequently, over time the status of a relevant node can only change from being alive to being dead, but not the other way round. Thus, with time, the set D keeps growing, where the set A keeps shrinking. We now explain how to implement a given iteration t. Implementing iteration t: Let i = stack(t -1). If &#923; (t-1) = &#8709;, then we set i = -1. If i = k, then there will be no more iterations, i.e., the algorithm will last for only t -1 iterations. In this scenario, we know that the sequence of matchings M</p><p>Based on this sequence, we identify a collection of |M &#955; (t-1) k | many node-disjoint augmenting paths w.r.t. M in in G, augment M in along those paths (see Observation 5.7), and return the resulting matching M out . Accordingly, from now on we assume that i &#8804; k -1.</p><p>During iteration t, we will attempt to find a large matching M &#8242; in G 2i+2 between two sets of nodes: A 2i+3 and C 2i+2 . Recall that A 2i+3 denotes the alive nodes at layer 2i + 3. We refer to C 2i+2 as the set of candidate nodes for iteration t. Intuitively, we pick as many nodes from V 2i+2 into the set C 2i+2 as possible, subject to two constraints: (i) if we append M &#8242; at the end of the sequence of matchings from the currently active iterations, then the resulting sequence will continue to remain nested, and (ii) the nodes in C 2i+2 are currently alive. This leads us to the following definition.</p><p>We now call the subroutine LargeMatching(C 2i+2 &#8746; A 2i+3 , &#948; i+1 ), in an attempt to obtain a large matching in</p><p>The last equality holds because of Observation 5.5, and since C 2i+2 &#8838; V 2i+2 and A 2i+3 &#8838; V 2i+3 . We set &#963;(t) := 2i + 2. Now, we fork into one of the following three cases.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (a):</head><p>The call to LargeMatching(C 2i+2 &#8746; A 2i+3 , &#948; i+1 ) returns a matching M &#8242; . Thus, we are guaranteed that</p><p>We set M (t) := M &#8242; , &#923; (t) := &#923; (t-1) &#8746; {t} and stack(t) := stack(t -1) + 1. This implies that &#955; (t)</p><p>and &#955; (t) stack(t) := t. Henceforth, we refer to this iteration t as a forwarding iteration at layer (2i + 2). We now move on to the next iteration (t + 1).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (b):</head><p>The call to LargeMatching(C 2i+2 &#8746; A 2i+3 , &#948; i+1 ) returns &#8869;, and i = -1. Here, the algorithm terminates and returns Failure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (c):</head><p>The call to LargeMatching(C 2i+2 &#8746; A 2i+3 , &#948; i+1 ) returns &#8869;, and i &#8805; 0. Here, we set M (t) := &#8709;. We also change the status of all the nodes in C 2i+2 , along with their matched neighbors under M in (who are at layer 2i+ 1), from alive to dead, and respectively move these nodes from A 2i+1 to D 2i+1 and from A 2i+2 to D 2i+2 . Next, we set &#923; (t) := &#923; (t-1) \ {&#955; (t-1) i } and stack(t) := stack(t -1) -1. This implies that &#955; (t) j := &#955; (t-1) j for all j &#8712; [0, stack(t)]. Henceforth, we refer to this iteration t as a backtracking iteration for layer 2i. We now move on to the next iteration (t + 1).</p><p>Remark: From the above description of the template algorithm, it immediately follows that Invariants 5.8, 5.9, 5.10 and 5.11 continue to hold at the end of each iteration t.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.2">Analysis</head><p>In this section, we analyze the template algorithm, and prove the following lemma.</p><p>Lemma 5.12. The algorithm Augment-Template(G, M in , k, &#947;) runs for at most T = &#920; k,&#947; (1) iterations. It either returns a matching M out in G of size</p><p>&#8226; n (we say that the algorithm "succeeds" in this case), or it returns Failure. Furthermore, if M in admits a collection of &#947; &#8226; n many node-disjoint length (2k + 1)-augmenting paths in G, then the algorithm succeeds whp.</p><p>We start by focusing on bounding the number of iterations (see Corollary 5.14).</p><p>Claim 5.13. There can be at most 1/(&#968; i ) backtracking iterations for layer 2i, where i &#8712; [0, k -1].</p><p>Proof. Consider any backtracking iteration t for layer 2i. Then we have &#963;(t -1) = i, and Invariant 5.10 implies that</p><p>Thus, during iteration t, at least &#968; i &#8226; n nodes at layer (2i + 1) change their status from alive to dead. Since there are at most n nodes at layer (2i + 1), such an event can occur at most 1/(&#968; i ) times.</p><p>Corollary 5.14. The algorithm Augment-Template(G, M in , k, &#947;) has at most &#920; k,&#947; (1) iterations.</p><p>Proof. Let T f , T b and T 0 respectively denote the total number of forwarding iterations across all layers, the total number of backtracking iterations across all layers, and the total number of iterations across all layers that are neither forwarding nor backtracking.We have T 0 = 1 if the template algorithm returns Failure, and T 0 = 0 otherwise.</p><p>We now observe that: there cannot exist a sequence of more than (k + 1) consecutive forwarding iterations, for otherwise, the (k + 2) th forwarding iteration in this sequence would have to take place at a layer &#8805; (2k + 2), which does not exist. Hence, we have:</p><p>, and the total number of iterations is bounded by:</p><p>.</p><p>The second inequality follows from Claim 5.13, and the last equality follows from (1).</p><p>We now move on to showing that if M in admits a collection &#947; &#8226; n many node-disjoint length (2k + 1)-augmenting paths in G, then the template algorithm succeeds whp. Towards this end, let P denote a maximum-sized collection of node-disjoint length (2k + 1)-augmenting paths in G w.r.t. M in . Let P * &#8838; P be the subset of paths in P that survive the random partitioning. If |P| &#8805; &#947; &#8226; n, then Lemma 5.3 guarantees that whp:</p><p>(2)</p><p>At any point in time during the execution of the algorithm Augment-Template(G, M in , k, &#947;), we say that a path p &#8712; P * is alive if all the nodes on p are alive, and we say that the path p is dead otherwise. Let P * A &#8838; P * and P * D = P * \ P * A respectively denote the set of alive and dead paths at any point in time. Just before the start of iteration 1, we have P * A = P * and P * D = &#8709;. Subsequently, a path p &#8712; P * can change its status from alive to dead only during a backtracking iteration (note that this change occurs in only one direction, i.e., a dead path will never become alive). The next claim upper bounds the number of such changes.</p><p>Claim 5.15. During a backtracking iteration for layer 2i, where i &#8712; [0, k -1], at most &#948; i+1 &#8226; n many paths in P * moves from P * A to P * D . Proof. Let t &#8805; 1 denote a backtracking iteration for layer 2i. During iteration t, the algorithm calls</p><p>We have: &#181;(G &#8242; ) &lt; &#948; i+1 &#8226; n, for otherwise the call to LargeMatching(., .) would not have returned &#8869;.</p><p>Just before iteration t, let P &#8242; &#8838; P * A denote the subset of paths in P * A that pass through some node in C 2i+2 . Only the paths in P &#8242; move from P * A to P * D at the end of iteration t. We can, however, form a matching in G &#8242; which contains one edge from each path in P &#8242; . Hence, we have</p><p>This concludes the proof of the claim.</p><p>Corollary 5.16. Let &#968; = &#920; k,&#947; (1) be a sufficiently small constant depending on k and &#947;, and suppose that (2) holds. Then throughout the entire duration of the algorithm, we have:</p><p>Proof. From (1), Claim 5.13 and Claim 5.15, we infer that:</p><p>Now, since we can set &#968; to be any sufficiently small constant value depending on k and &#947;, and since &#948; 0 &#8804; &#968; according to (1), from (2) we get:</p><p>This concludes the proof.</p><p>Corollary 5.17. If (2) holds, then the algorithm does not return Failure.</p><p>Proof. For contradiction, suppose that the algorithm returns Failure at the end of an iteration t.</p><p>Let i = &#963;(t -1). Since the algorithm returns Failure after iteration t, we must have i = -1. Furthermore, during iteration t, the call to</p><p>Next, observe that C 0 = A 0 . Hence, just before the start of iteration t, we could have formed a matching in G &#8242; by taking the first edge of each path in P * A . Thus, from Corollary 5.16, we get:</p><p>However, both (4) and (5) cannot simultaneously be true. This leads to a contradiction.</p><p>Note that if the template algorithm does not return Failure, then it necessarily returns a matching M out of size |M out | &#8805; |M in | + &#968; k &#8226; n (this holds because of Invariant 5.9, Invariant 5.10 and Observation 5.7). Finally, recall that &#968; k = &#920; k,&#947; (1) as per (1). Lemma 5.12 now follows from Corollary 5.14, Lemma 5.3 and Corollary 5.17.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Implementation in Sublinear Models</head><p>In this section, we show how to implement the template algorithm from Section 5.1, when we are allowed access to the input graph G only via adjacency-matrix queries. Throughout this section, we use the following parameters (recall Definition 5.1).</p><p>In Section 5.1.1, the template algorithm starts with iteration t = 1. Here, we use the phrase "iteration t = 0" to refer to the scenario just before the start of the first iteration. Towards this end, for consistency of notations, we define &#491; -1 := 2, M (0) := M in , &#963;(0) :=&#8869;, stack(0) := -1 and &#923; (0) := &#8709;. Further, we define an oracle alive 0 (v) that is supposed to return Yes if v is alive at the end of iteration 0 (i.e., just before the start of iteration 1), and return No otherwise. The rest of this section is organized as follows. Lemma 5.18 shows how to implement each iteration of the template algorithm, under adjacency-matrix query access to the input graph G. Its proof appears at the end of this section. Theorem 5.2 now follows from Lemma 5.12 and Corollary 5.19.</p><p>Lemma 5.18. Suppose that we can access the input graph G only via adjacency-matrix queries, and we have an oracle match M in (.) with &#213;k,&#947; (n 1+&#491;in ) query time. Then we can implement each iteration t &#8805; 0 of the algorithm Augment-Template(G, M in , k, &#947;), as described in Section 5.1, in &#213;k,&#947; (n 2-&#491; t-1 ) time. Furthermore, if the concerned iteration t does not result in the algorithm returning Failure, then we can ensure that we have access to the following data structures at the end of iteration t.</p><p>&#8226; An oracle match M (t) (.) for the matching M (t) , that has a query time of &#213;k,&#947; (n 1+&#491;t ).</p><p>&#8226; An oracle alive t (.) that has a query time of &#213;k,&#947; (n 1+&#491;t ). When queried with a node v &#8712; V , this oracle returns Yes if v is alive at the end of iteration t, and returns No otherwise.</p><p>&#8226; The values &#963;(t) and stack(t), and the contents of the set &#923; (t) .</p><p>Corollary 5.19. Let &#491; out := 9 T &#8226; &#491; in , where T = &#920; k,&#947; (1) is the maximum possible number of iterations of the template algorithm (see Lemma 5.12). Then it takes &#213;k,&#947; (n 2-&#491;in ) time to implement the template algorithm, under adjacency-matrix query access to G. Further, if the template algorithm does not return Failure, then at the end of our implementation we have an oracle match M out (.) for the matching M out returned by it, with query time &#213;k,&#947; (n 1+&#491;out ).</p><p>Proof. By Lemma 5.18, each iteration t of the template algorithm can be implemented in time &#213;k,&#947; (n 2-&#491; t-1 ) = &#213;k,&#947; (n 2-&#491;in ), since &#491; in &#8804; &#491; t-1 . Thus, the total time taken to implement the template algorithm is at most &#213;k,&#947; (T &#8226; n 2-&#491;in ) = &#213;k,&#947; (n 2-&#491;in ).</p><p>Suppose that the template algorithm terminates at the end of iteration t, and returns a matching M out . Then, at the end of iteration t of our sublinear implementation, the situation is as follows.</p><p>,where &#963; &#955; (t) j = 2j for each j &#8712; [0, k] (see Invariant 5.8). The sequence of matchings in &#923; (t) is nested (see Invariant 5.9). Thus, from this sequence of nested matchings we can extract a set of at least M &#955; (t) k many node-disjoint length (2k + 1)-augmenting paths w.r.t. M in in G (see Observation 5.7). The template algorithm obtains the matching M out by applying these augmenting paths to M in . In our sublinear implementation of the template algorithm, however, we can access each matching M &#8712; &#923; (t) only via an oracle match M (.), which has a query time of at most &#213;k,&#947; (n 1+&#491;t ) (see (6) and Lemma 5.18). Furthermore, we can access the matching M in only via the oracle match M in (.), which also has a query time of at most &#213;k,&#947; (n 1+&#491;in ) = &#213;k,&#947; (n 1+&#491;t ).</p><p>We now show how to answer a query to the oracle match M out (v). The key observation is this:</p><p>M (see the discussion on layered subgraphs in Section 5.1.1). Then the graph G * = (V, E * ) consists of a collection of node-disjoint alternating paths w.r.t. M in . We say that a path in G * is complete iff it has one endpoint at layer 0 and the other endpoint at layer (2k + 1). Now, a node v &#8712; V is matched in M out iff: either v &#8712; V (M in ), or v / &#8712; V (M in ) and v is the starting/end point of a complete path in G * .</p><p>Using this observation, we now describe how to answer queries of the form: "Is match M out (v) =&#8869; for a given node v &#8712; V ?". To answer such a query, we apply the procedure below.</p><p>If match M in (v) =&#8869;, then we return that match M out (v) =&#8869;. Else if match M in (v) =&#8869; and &#8467;(v) / &#8712; {0, 2k + 1}, then we return that match M out (v) =&#8869;. Finally, if match M in (v) =&#8869; and w.l.o.g. &#8467;(v) = 0, then we perform the following steps.</p><p>&#8226; v 0 &#8592; v.</p><p>&#8226; For i = 1 to 2k + 1:</p><p>-Else if i is even, then v i &#8592; mate M in (v i-1 ).</p><p>-If v i =&#8869;, then return that match M out (v) =&#8869;.</p><p>&#8226; Return that match M out (v) =&#8869;.</p><p>It is easy to verify that the above procedure correctly returns whether or not match M out (v) =&#8869;. We can extend this procedure in a natural manner, which would also allow us to answer the query match M out (v). To summarize, we can answer a query match M out (v) by making at most one call to each of the oracles match M (.), for M &#8712; &#923; (t) , and at most &#920;(k) calls to the oracle match M in (.). Each of these oracle calls take at most &#213;k,&#947; (n 1+&#491;t ) time, as &#491; t &#8242; &#8804; &#491; t for all t &#8242; &#8712; [1, t]. Since &#923; (t) = k, the oracle match M out (.) has a query time of &#213;k,&#947; (k &#8226; n 1+&#491;t ) = &#213;k,&#947; (n 1+&#491;t ) = &#213;k,&#947; (n 1+&#491;out ), where the last equality holds since &#491; t &#8804; &#491; T = &#491; out . This concludes the proof.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proof of Lemma 5.18</head><p>We prove the lemma by induction on t.</p><p>Base case (t = 0):</p><p>We already have the oracle match M (0) (.) with query time &#213;k,&#947; (n 1+&#491; 0 ), since &#491; 0 = &#491; in and M (0) = M in . We set &#963;(t) &#8592;&#8869;, stack(t) &#8592; -1 and &#923; (0) &#8592; &#8709;. We now claim that we already have the oracle alive 0 (.). This is because a node v &#8712; V is alive just before the start of iteration 1 if and only if v &#8712; V H . Furthermore, given a query alive 0 (v), we can determine whether v is in V H by checking the value of &#8467;(v), setting u &#8592; mate M (0) (v), and then checking the value of &#8467;(u) if u =&#8869;. Thus, answering a query to the oracle alive 0 (.) takes &#213;k,&#947; (n 1+&#491; 0 ) time. So, we can implement iteration 0 in O(1) time, and Lemma 5.18 holds for t = 0.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Inductive case (t &#8805; 1):</head><p>We assume that Lemma 5.18 holds for all t &#8242; &#8712; [0, t-1], and that we have access to the data structures constructed during all these past iterations. We now focus on implementing the current iteration t (see Section 5.1.1) under adjacency-matrix query access to G. Let i = stack(t -1). If i = k, then the algorithm would terminate after iteration (t -1). Henceforth, we assume that i &#8712; [-1, k -1].</p><p>In the current iteration t, the template algorithm wants to first compute a matching M (t) by calling the subroutine LargeMatching(C 2i+2 &#8746; A 2i+3 , &#948; i+1 ). We first show that we can efficiently query whether or not a given node in V belongs to the set C 2i+2 &#8746; A 2i+3 . Subsequently, we split up our implementation of iteration t into two steps, as described below.</p><p>Claim 5.20. Given any node v &#8712; V , we can determine if v &#8712; C 2i+2 in &#213;k,&#947; (n 1+&#491; t-1 ) time.</p><p>Proof. We first check the value of &#8467;(v) and call alive t-1 (v). Now, we consider the following cases.</p><p>(i) &#8467;(v) = 2i + 2. Here, we return that v / &#8712; C 2i+2 .</p><p>(ii) &#8467;(v) = 2i + 2 and alive t-1 (v) = No. Here, we also return that v / &#8712; C 2i+2 .</p><p>(iii) &#8467;(v) = 2i + 2, alive t-1 (v) = Yes, and i = -1. Here, we return that v &#8712; C 2i+2 .</p><p>(iv) &#8467;(v) = 2i + 2, alive t-1 (v) = Yes and i &#8805; 0. Here, we first set u v &#8592; mate M in (v), which takes &#213;k,&#947; (n 1+&#491;in ) = &#213;k,&#947; (n 1+&#491; t-1 ) time. Next, we call match M &#955; (t-1) i (u v ), which also takes at most &#213;k,&#947; (n 1+&#491; t-1 ) time. Finally, we return that v &#8712; C 2i+2 iff match</p><p>The correctness of the above procedure follows from the definition of the set C 2i+2 . Furthermore, the preceding discussion implies that this procedure overall takes at most &#213;k,&#947; (n 1+&#491; t-1 ) time.</p><p>Corollary 5.21. Given any v &#8712; V , we can determine if v &#8712; C 2i+2 &#8746; A 2i+3 in &#213;k,&#947; (n 1+&#491; t-1 ) time.</p><p>Proof. We can determine if v &#8712; A 2i+3 by checking the value of &#8467;(v) and making a query alive t-1 (v), which takes &#213;k,&#947; (n 1+&#491; t-1 ) time. The corollary now follows from Claim 5.20.</p><p>Step I: Constructing the oracle match M (t) (.). Armed with Corollary 5.21, we mimic the call to LargeMatching(C 2i+2 &#8746; A 2i+3 , &#948; i+1 ) in the template algorithm, by invoking Theorem 4.1 with</p><p>1 and t A = &#213;k,&#947; (n 1+&#491; t-1 ). 10 If Theorem 4.1 returns &#8869;, then we set M (t) := &#8709;, and the trivial oracle match M (t) (.) has O(1) = &#213;k,&#947; (n 1+&#491;t ) query time. Otherwise, Theorem 4.1 returns an oracle match M (.) for a matching M , and we set M (t) := M . By (6) and Theorem 4.1, this oracle match M (t) (.) has query time:</p><p>Finally, from (6) and Theorem 4.1, we infer that overall Step I takes time:</p><p>In the above derivation, the second equality holds since &#491; = 2&#491; t-1 &#8804; 9 T &#491; in &#8804; 1/5 (see (6) and Definition 5.1), whereas the third equality holds since t A = &#213;k,&#947; (n 1+&#491; t-1 ).</p><p>Step II: Determining &#963;(t), stack(t), &#923; (t) , and the oracle alive (t) (.). We set &#963;(t) &#8592; 2i + 2.</p><p>We now fork into one of the following three cases.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (a) In</head><p>Step I, the invocation of Theorem 4.1 returned an oracle match M (.) for a matching M , and we set M (t) := M . This will be referred to as a forwarding iteration at layer 2i + 2. In this case, we set &#923; (t) &#8592; &#923; (t-1) &#8746; {t} and stack(t) &#8592; stack(t -1) + 1. Now, we observe that the set of alive nodes does not change during such a forwarding iteration, and so we already have the oracle alive t (.), because alive t (v) = alive (t-1) (v) for all v &#8712; V . Accordingly, the oracle alive t (.) has query time &#213;k,&#947; (n 1+&#491; t-1 ) = &#213;k,&#947; (n 1+&#491;t ).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Case (b): In</head><p>Step I, the invocation of Theorem 4.1 returned &#8869;, and i = -1. Here, the algorithm terminates and returns Failure.</p><p>Case (c): In Step I, the invocation of Theorem 4.1 returned &#8869;, and i &#8805; 0. This will be referred to as a backtracking iteration at layer 2i. In this case, we set &#923; (t) &#8592; &#923; (t-1) \ {&#955; (t-1) i } and stack(t) &#8592; stack(t -1) -1. Now, we observe that due to iteration t, only the nodes in C 2i+2 and their matched neighbors under M in (who are at layer 2i + 1), change their status from alive to dead. The status of every other node remains unchanged. Thus, we can answer a query alive t (v), in &#213;k,&#947; (n 1+&#491;t ) time, as follows.</p><p>We first check the value of &#8467;(v), query alive t-1 (v) and match M in (v), and determine whether or not v &#8712; C 2i+2 by invoking Claim 5.20. Overall, this takes &#213;k,&#947; (n 1+&#491; t-1 ) + &#213;k,&#947; (n 1+&#491;in ) = &#213;k,&#947; (n 1+&#491;t ) time. Next, we consider three cases.</p><p>(i) &#8467;(v) / &#8712; {2i + 1, 2i + 2}. Here, we return alive t (v) = alive (t-1) (v).</p><p>(ii) &#8467;(v) = 2i + 2. Here, if v &#8712; C 2i+2 then we return alive t (v) = No; otherwise we return alive t (v) = alive t-1 (v). (iii) &#8467;(v) = 2i + 1. Here, we set u v &#8592; mate M in (v). Now, if u v &#8712; C 2i+2 then we return alive t (v) = No; otherwise we return alive t (v) = alive t-1 (v).</p><p>To summarize, Step I takes &#213;k,&#947; (n 2-&#491; t-1 ) time, whereas Step II takes only O(k) = O k,&#947; (1) time. Furthermore, at the end of Step II we have all the desired data structures for iteration t, and both the oracles match M (t) (.) and alive t (.) have a query time of at most &#213;(n 1+&#491;t ). Finally, Theorem 4.1 ensures that whp, the way we decide whether we are in case (a), case (b) or case (c) is consistent with the choice made by the template algorithm in the same scenario (see the discussion on "implementing iteration t" in Section 5.1.1, and how the subroutine LargeMatching(S, &#948;) is defined in the second paragraph of Section 5.1). This concludes the proof of Lemma 5.18.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">(1, &#491;n)-Approximate Matching Oracle</head><p>In this section, starting from an empty matching, we repeatedly apply Theorem 5.2 to obtain our main results in the sublinear setting. They are summarized in the theorem and the corollary below, which are restatements of Theorem 1.3. Theorem 6.1. Let &#947;, &#491; &#8242;&#8242; &#8712; (0, 1) be any two small constants, and let G be the input graph with n nodes which we can access via adjacency-matrix queries. Then for a sufficiently small constant &#491; &#8242; &#8712; (0, &#491; &#8242;&#8242; ), there exists an algorithm which: In &#213;&#947; (n 2-&#491; &#8242; ) time, returns an oracle match M (.) for a (1, 3&#947;n)-approximate matching M in G, where the oracle match M (.) has &#213;&#947; (n 1+&#491; &#8242;&#8242; ) query time.</p><p>Corollary 6.2. Given adjacency-matrix query access to an n-node graph G and any constant &#947; &#8712; (0, 1), in &#213;&#947; (n 2-&#491; ) time we can return a (1, 4&#947;n)-approximation to the value of &#181;(G), whp. Here, &#491; &#8712; (0, 1) is a sufficiently small constant depending on &#947;.</p><p>Proof. First, we apply Theorem 6.1, with &#491; = &#491; &#8242; , to get the oracle match M (.) in &#213;&#947; (n 2-&#491; ) time. Note that M is a (1, 3&#947;n)-approximate matching in G = (V, E). Using Chernoff bound, we now compute a (1, &#947;n)-approximate estimate &#956; of |M | by sampling, uniformly at random, a set S of &#213;&#947; (1) nodes from V and querying match M (v) for each node v &#8712; S. This takes &#213;&#947; (n 1+&#491; &#8242;&#8242; ) = &#213;&#947; (n 2-&#491; ) time. The last inequality holds since &#491; = &#491; &#8242; and &#491; &#8242;&#8242; are chosen to be sufficiently small, so that 1 + &#491; &#8242;&#8242; &#8804; 2&#491;. It is now easy to observe that &#956; is a (1, 4&#947;n)-approximation to the value of &#181;(G).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proof of Theorem 6.1</head><p>Algorithm 4 contains the relevant pseudocode. We slightly abuse the notation in step 2-(b) of Algorithm 4, when we write Z = (M out , &#491; out ). Here, we essentially mean that Augment(G, M in , i, &#947; 2 , &#491; in ) returns the oracle match M out (.) with query time &#213;&#947; (n 1+&#491;out ). Similarly, in step 2-(b), when we write M in &#8592; M out , this means that henceforth we will refer to the oracle match M out (.) as match M in (.).</p><p>The idea behind Algorithm 4 is simple and intuitive. We start by initializing M in &#8592; &#8709;, k &#8592; &#8968;1/&#947;&#8969; and &#491; in &#8592; &#491; &#8242; , where &#491; &#8242; &#8712; (0, 1) is a sufficiently small constant. At this point, we trivially have the oracle match M in (.) with query time &#213;&#947; (n 1+&#491;in ). The algorithm now runs in rounds. In each round, it repeatedly tries to augment the matching M in along small-length augmenting paths, by successively calling Augment(G, M in , i, &#947; 2 , &#491; in ) for i &#8712; [0, k]. Whenever a call to Augment(&#8226;) succeeds, the algorithm feeds its output into the next call to Augment(&#8226;). The algorithm terminates whenever it encounters a round where every call to Augment(&#8226;) returns Failure. Claim 6.3. Algorithm 4 runs for &#213;&#947; (1) rounds, and makes &#213;&#947; (1) calls to Augment(&#8226;).</p><p>Proof. Say that a given round of Algorithm 4 is successful iff during that round: for some i &#8712; [0, k], the call to Augment(G, M in , i, &#947; 2 , &#491; in ) succeeded (see Theorem 5.2 and step 2-(a) of Algorithm 4). By Theorem 5.2, each time a call to Augment(G, M in , i, &#947; 2 , &#491; in ) succeeds, it increases the size of the matching M in (see step 2-(b) of Algorithm 4) by at least &#920; &#947; (1) &#8226; n. Since &#181;(G) &#8804; n, such an event can occur at most &#920; &#947; (1) times. Finally, each round of Algorithm 4 consists of (k + 1) = &#920; &#947; (1) calls to Augment(&#8226;), and all but the last round is successful. This implies the claim.</p><p>Algorithm 4 Near-optimal-matching-oracle (G = (V, E), &#947;). Choose &#491; &#8242; &#8712; (0, 1) to be a sufficiently small constant.</p><p>2. For i = 0 to k:</p><p>Return the oracle match M (&#8226;), which has query time &#213;&#947; (n 1+&#491; &#8242;&#8242; ).</p><p>Claim 6.4. Suppose that at the start of a given round of Algorithm 4, there exists a collection of at least &#947; 2 &#8226; n many node-disjoint length (2i + 1)-augmenting paths w.r.t. M in in G, for some i &#8712; [0, k].</p><p>Then whp, Algorithm 4 does not terminate at the end of the given round.</p><p>Proof. If there exists some j &#8712; [0, i -1] such that the call to Augment(G, M in , j, &#947; 2 , &#491; in ) succeeds during the given round, then it immediately implies the claim (since we would have &#964; = True when the round ends and so the While loop in Algorithm 4 will run for at least one more iteration).</p><p>For the rest of the proof assume that during the given round, for all j &#8712; [0, i -1] the call to Augment(G, M in , j, &#947; 2 , &#491; in ) returns Failure, and hence the matching M in does not change during iterations j = 0 to i -1 of the For loop. Accordingly, at the start of the concerned iteration i of the For loop, the matching M in still admits a collection of at least &#947; 2 &#8226; n many node-disjoint length (2i + 1)-augmenting paths in G. Thus, by Theorem 5.2, the call to Augment(G, M in , i, &#947; 2 , &#491; inp ) succeeds whp. So, it follows that Algorithm 4 does not end after the given round, whp. Corollary 6.5. When Algorithm 4 terminates, whp M in is a (1, 3&#947;n)-approximate matching in G.</p><p>Proof. Let M * be a maximum matching in G. By Claims 6.3 and 6.4, the following holds whp when the algorithm terminates: For all i &#8712; [0, k], there exists at most &#947; 2 &#8226; n many length-(2i + 1) augmenting paths in M in &#8746; M * .</p><p>As k = &#8968;1/&#947;&#8969;, the augmenting paths in M in &#8746; M * that are of length &#8804; 2k + 1 contribute at most (k + 1) &#8226; &#947; 2 n &#8804; 2&#947;n extra edges to M * compared to M in . On the other hand, augmenting paths</p><p>Since Algorithm 4 makes only constantly many calls to Augment(&#8226;), we can choose &#491; &#8242; &gt; 0 to be sufficiently small so as to guarantee that 0 &lt; &#491; &#8242;&#8242; &#8810; 1 (see Claim 6.3 and Theorem 5.2). Further, during the execution of Algorithm 4, each call to Augment(&#8226;) takes &#213;&#947; (n 2-&#491;in ) = &#213;&#947; (n 2-&#491; &#8242; ) time. Theorem 6.1 now follows from Claim 6.3 and Corollary 6.5. matching M of G, which takes O &#491; (m init ) time <ref type="bibr">[DP14]</ref>. Define &#181; * = |M |. Throughout the phase, &#181; * continues to remain a (1 + 2&#491;)-approximate estimate of &#181;(G), and we continue to output the same value &#181; * . This leads to an amortized update time of:</p><p>The last equality holds since |E| = m = &#920;(m init ) throughout the phase. We can ensure that throughout the phase, the algorithm explicitly maintains M , which remains a (1 + 2&#491;)-approximate maximum matching of G. This gives us the matching oracle match M (.), with constant query time.</p><p>Type-II Phase: At the start of a type-II phase, we have &#956; &lt; |E| 0.5+&#491; 0 . Let &#956;init and m init respectively denote the value of &#956; and |E| at the start of the phase. The phase will last for the next &#491;&#956; init updates. Hence, &#181;(G) can change by at most a multiplicative (1 + &#491;) factor during the phase.</p><p>At the start of the phase, for each i &#8712; I, we find a (1, 4&#947;)-approximate estimate &#181; * i of &#181;(G &#966; i ). We obtain &#181; * i by invoking Corollary 6.2 on G &#966; i , with &#947; = &#491; 2 .<ref type="foot">foot_7</ref> This takes time:</p><p>where &#491; * &#8712; (0, 1) is a sufficiently small constant depending on &#491;. Since |I| = &#213;(1), overall we spend &#213;&#491; (&#956; init ) 2-&#491; * time to compute &#181; * i for all i &#8712; I. Next, in |I| = &#213;(1) time, we find an index j &#8712; I which maximizes the value &#181; * j . From Theorem 7.2, it follows that:</p><p>As &#947; = &#491; 2 , we infer that &#181; * j is a purely multiplicative (1 + &#920;(&#491;))-approximate estimate of &#181;(G). We continue to output the same value &#181; * j throughout the phase, since we have already observed that during the phase &#181;(G) changes by at most a multiplicative (1 + &#491;) factor.</p><p>The phase lasts for &#491;&#956; init updates. Accordingly, this leads to an amortized update time of:</p><p>&#213;&#491; (&#956; init ) 2-&#491; * &#491;&#956; init = &#213;&#491; (&#956; init ) 1-&#491; * = &#213;&#491; (m init ) 0.5+&#491; 0 -&#491; * = m 0.5-&#8486;&#491;(1) .</p><p>The last equality holds because we can ensure that &#491; 0 is sufficiently small compared to &#491; * (which, in turn, depends on &#491;), and since |E| = m = &#920;(m init ) throughout the duration of the phase. Because of (7), at the start of the phase we can construct the oracle match M (.) by invoking Theorem 6.1 on the graph G &#966; j * . Over the &#491;&#956; init edge updates of the phase, M continues to remain a (1 + O(&#491;))-approximate maximum matching in G. Finally, to maintain the oracle under edge insertions/deletions during the phase, we simply ignore edge deletions and assume that if a vertex is matched by a deleted edge of M then it is unmatched.</p><p>Improving the update time bound to worst-case: Recall that &#956;init denotes the value of &#956; at the beginning of a phase. As observed after the initialization of a phase the output maintained by the algorithm remains (1 + O(&#491;))-approximate for the next O(&#491; &#8226; &#956;init ) updates. Furthermore, the total computational work done by the algorithm in both types of phases is upper bounded by &#213;(&#956; 2-&#491; * init ) for some small constant &#491; * &#8712; (0, 1) depending on the parameters of the algorithm. Let G i stand for the state of the input graph at the beginning of phase i and let A(G i ) stand for the output of the previously described algorithm initialized on G i . Let &#956;init,i stand for the value of &#956;init at the beginning of phase i. We will now describe the behaviour of the worst-case update time algorithm.</p><p>The improved algorithm similarly initializes it's output to be A(G 1 ). Throughout the first three phases it does not alter it's output and during the first two phases it doesn't complete any background computation. During phase i for i &gt; 2 the algorithm calculates A(G i-2 ) distributing the work evenly throughout the phase. Note that by phase i the algorithm has complete knowledge of G i-2 . At the end of the same phase it switches it's output to be A(G i-2 ).</p><p>As the algorithm only outputs a matching size estimate and an oracle and not an actual matching this switch is done in constant time. Computing A(G i-2 ) takes time proportional to &#213;(&#956; 2-&#491; * init,i-2 ) and is distributed over &#491; &#8226; &#956;init,i updates. As during a phase &#181;(G) may change by at most a 1 + O(&#491;) multiplicative factor we must have that &#956;init,i = &#920;(&#956; init,i-2 ). The amortized implementation amortizes the work of computing A(G i-2 ) over &#491; &#8226; &#956;init,i-2 updates. This implies that the worstcase update time guarantee of the delayed rebuild based algorithm matches the amortized versions update time within a constant factor.</p><p>Basic Properties of Randomized Greedy Matching. Recall that d is the given parameter where d &#8805; d. We set the threshold &#8467; = &#920;(d log(n)/&#491;) such that &#8467; &#8805; &#945; &#8226; 8 &#491; . We have by Markov's inequality that Pr  <ref type="formula">8</ref>). We also say that &#960; &#8712; &#928; is good if f (&#960;) &#8804; &#491;, otherwise we say that it is bad.</p><p>Algorithm 7 TestPerm(&#960;)</p><p>1. Sample r = 1000 log(n)/&#491; vertices independently: v 1 , . . . , v r .</p><p>2. Let X = |i | {T (v i , &#960;) &gt; &#8467;}| and f = X/r.</p><p>3. If f &#8804; 3 4 &#491;, return "yes". Otherwise, return "no".</p><p>A simple procedure in Algorithm 7 accepts a great permutation and rejects a bad permutation with high probability.</p><p>Lemma A.3. If &#960; is great, then TestPerm(&#960;) returns "yes" with high probability. If &#960; is bad, then TestPerm(&#960;) returns "no" with high probability.</p><p>Proof. If &#960; is great but TestPerm(&#960;) returns "no", then we have f (&#960;) &#8804; &#491;/2 but f &gt; 3&#491;/4. Since Preprocessing. The preprocessing algorithm is as follows. First, independently sample O(log n) random edge-permutations &#960; 1 , . . . , &#960; O(log n) . For any i, if TestPerm(&#960; i ) returns "yes", then set &#960; * &#8592; &#960; i .</p><p>Observe that this argument holds regardless of the choice of M * the statement remains true as G undergoes updates even when the updates are made by an adaptive adversary. The contractions &#966; i j are fixed at initialization. The task of the algorithm is to maintain maximum matching size estimate &#956;(G) and hence maintain the accurate guess of &#181;(G) and to update the contracted graphs. Each contracted graph may undergoes a single update per update to G and there are &#213;(1) contracted graphs. All parts included the worst-case update time of the algorithm is &#213;(1). implying that M is always a (1 + O(&#491;))-approximate matching.</p><p>To see why &#181;(G) &#8805; m/2n, consider the process where we repeatedly choose an edge e and delete both endpoints of e from the graph until no edge is left. Since the set of deleted edges forms a matching, we may repeat at most &#181;(G) times. Also, each deletion removes at most 2&#8710; edges from the graph. Therefore, m &#8804; &#181;(G) &#8226; 2&#8710; &#8804; &#181;(G) &#8226; 2n.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D Tables</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Model Adjacency List</head><p>Adjacency List Adjacency Matrix Guarantee Approx Time Approx Time Approx Time <ref type="bibr">[PR07]</ref> (2, &#491;n) &#8710; O(log(&#8710;/&#491;))</p><p>[NO08]</p><p>(2, &#491;n) 2 O(&#8710;) /&#491; 2 (1, &#491;n)</p><p>(2, &#491;n)</p><p>(2, &#491;n)</p><p>(2, &#491;n)</p><p>(1.5, &#491;n) nd 1-&#8486;(&#491; 2 ) 1.5 + &#491; n&#8710; 1-&#8486;(&#491; 2 ) (1.5, &#491;n) n 2-&#8486;(&#491; 2 )</p><p>[BRR23] bipartite graph only</p><p>(1.5 -&#8486;(1), o(n)) n 2-&#8486;(1) 1.5 -&#8486;(1) n 2-&#8486;(1) (1.5 -&#8486;(1), o(n)) n 2-&#8486;(1)</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Our</head><p>(1, &#491;n) n 2-&#8486;&#491;(1)</p><p>Table 1: Summary of sublinear-time algorithms for estimating the size of maximum matching. We omit polylog(n/&#491;) factors. &#8710; and d denote the maximum and average degree of the graph, respectively.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_0"><p>In<ref type="bibr">[RSW22]</ref>, they use different notation and write 0.99-approximation instead of 1.01.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1"><p><ref type="bibr">[BRR23]</ref> also announced a &#8486;(n 1.2 )-time lower bound for (3/2 -&#8486;(1), o(n))-approximation.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_2"><p>The dynamic (3/2 -&#8486;(1))-approximate algorithm in<ref type="bibr">[Beh23]</ref> does not have explicit relationship to streaming algorithms. It is obtained using sublinear algorithms to improve the (3/2)-approximation guarantee of the tight instances of EDCS.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_3"><p>Think of a n&#215;n bipartite graph which consists only of a perfect matching. Using o(n 2 ) adjacency-matrix queries, it is not possible to out &#8486;(n) matching edges in this input instance. The lower bound can be extended even if we allow adjacency-list queries by adding &#491;n dummy vertices, each of which connects to every other vertex.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_4"><p>Note that Hoeffding bound Proposition 3.1 is not strong enough here.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_5"><p>In fact, if we define M2 as M i from Line 2g in Algorithm 1, we would even have that |M2| &#8805; 4&#948;inn w.h.p. But we did use this bound just to avoid white-boxing the preprocessing algorithm and make the presentation of the query algorithm more modular.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_6"><p>The reader should keep in mind that in the current section (Section 6), we are using the symbol A to denote the set of alive nodes across all the layers. This is different from the way the symbol A is being used in the statement of Theorem 4.1, where it refers to any arbitrary subset of nodes.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_7"><p>It is trivial to verify that Corollary 6.2 holds even when applied on a multigraph.</p></note>
		</body>
		</text>
</TEI>
