<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Singular Value Approximation and Sparsifying Random Walks on Directed Graphs</title></titleStmt>
			<publicationStmt>
				<publisher>IEEE</publisher>
				<date>11/06/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10488608</idno>
					<idno type="doi">10.1109/FOCS57990.2023.00054</idno>
					
					<author>AmirMahdi Ahmadinejad</author><author>John Peebles</author><author>Edward Pyne</author><author>Aaron Sidford</author><author>Salil Vadhan</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[In this paper, we introduce a new, spectral notion of approximation between directed graphs, which we call singular value (SV) approximation. SV-approximation is stronger than previous notions of spectral approximation considered in the literature, including spectral approximation of Laplacians for undirected graphs [ST04], standard approximation for directed graphs [CKP+17], and unit-circle (UC) approximation for directed graphs [AKM+20]. Further, SV approximation enjoys several useful properties not possessed by previous notions of approximation, e.g., it is preserved under products of random-walk matrices and bounded matrices. We provide a nearly linear-time algorithm for SV-sparsifying (and hence UC-sparsifying) Eulerian directed graphs, as well as ℓ-step random walks on such graphs, for any ℓ ≤ poly(n). Combined with the Eulerian scaling algorithms of [CKK+18], given an arbitrary (not necessarily Eulerian) directed graph and a set S of vertices, we can approximate the stationary probability mass of the (S, Sᶜ) cut in an ℓ-step random walk to within a multiplicative error of 1/polylog(n) and an additive error of 1/poly(n) in nearly linear time. As a starting point for these results, we provide a simple black-box reduction from SV-sparsifying Eulerian directed graphs to SV-sparsifying undirected graphs; such a directed-to-undirected reduction was not known for previous notions of spectral approximation.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Random walks on graphs play a central role in theoretical computer science. In algorithm design, they have found a wide range of applications, including maximum flow [CKM+11], <ref type="bibr">[LRS13]</ref>, <ref type="bibr">[KLOS14]</ref>, [vdBGJ+22], [vdBLL+21], sampling random spanning trees <ref type="bibr">[KM09]</ref>, <ref type="bibr">[MST15]</ref>, and clustering and partitioning <ref type="bibr">[AM85]</ref>, <ref type="bibr">[KVV04]</ref>, <ref type="bibr">[ACL06]</ref>, <ref type="bibr">[OSV12]</ref>. Correspondingly, new algorithmic results on efficiently accessing properties of random walks have the potential for broad implications. In particular, in complexity theory, such algorithms have attracted attention as a promising approach to derandomizing space-bounded computation <ref type="bibr">[SZ99]</ref>, <ref type="bibr">[Rei08]</ref>, <ref type="bibr">[RTV06]</ref>, [AKM+20].</p><p>A.A.: the work was started prior to joining Amazon and does not relate to Amazon. E.P. is supported by an Akamai Presidential Fellowship. A.S. is supported by a Microsoft Research Faculty Fellowship, NSF CAREER Award CCF-1844855, NSF Grant CCF-1955039, a PayPal research award, and a Sloan Research Fellowship. S.V. is supported by a Simons Investigator Award.</p><p>In this paper we consider the well-studied problem of estimating the ℓ-step random walk on a directed graph. Given a strongly connected, weighted, directed graph G = (V, E, w), its associated random-walk matrix W ∈ ℝ^{V×V}, and an integer ℓ > 0, we seek to approximate key properties of the ℓ-step random walk, W^ℓ, more efficiently than we could by computing W^ℓ explicitly. 
For example, we may wish to estimate individual entries of W^ℓ, the conductance or cut probabilities of subsets of vertices, or (expected) hitting times between pairs of vertices.</p><p>In recent years, graph sparsification has emerged as a powerful approach for efficiently solving such problems. When the graph is undirected, we look for spectral sparsifiers of the Laplacian L = D − A, where D is the diagonal matrix of degrees and A is the adjacency matrix. It is known that for all ε ∈ (0, 1), there exist ε-spectral sparsifiers with sparsity Õ(|V|ε⁻²); that is, a Laplacian matrix L̃ with Õ(|V|ε⁻²) nonzero entries such that</p><p>(1 − ε) · L ⪯ L̃ ⪯ (1 + ε) · L. (1)</p><p>Spectral sparsifiers can be computed in nearly linear time <ref type="bibr">[ST04]</ref>, <ref type="bibr">[SS08]</ref>, <ref type="bibr">[BSS12]</ref>, <ref type="bibr">[PS14]</ref>. Normalizing such a sparsifier L̃ by D^{-1/2} on both sides, we obtain a spectral approximation of the normalized Laplacian D^{-1/2}LD^{-1/2}, which directly gives information about random walks because it is equivalent (up to a change of basis) to the random-walk Laplacian, LD⁻¹ = I − W. Indeed, from any spectral sparsifier L̃ satisfying Equation (1), we can approximate any desired cut (S, Sᶜ) in the original graph in nearly linear time by evaluating xᵀL̃x for x equal to the indicator vector of S. Furthermore, there are nearly linear-time algorithms for computing sparse ε-spectral sparsifiers corresponding to the ℓ-step random walk, i.e., sparsifiers of the weighted graph whose random-walk Laplacian is I − W^ℓ, for any polynomial length ℓ [CCL+15], <ref type="bibr">[MRSV21]</ref>.</p><p>Obtaining analogous results for sparsifying I − W^ℓ for directed graphs has been more challenging. For a directed graph, we consider the directed Laplacian [CKP+16] L = D_out − Aᵀ, where D_out is the associated diagonal matrix of out-degrees and Aᵀ is the transpose of the associated weighted adjacency matrix A. 
In comparison to their symmetric counterparts for undirected graphs, nearly linear-time sparsification algorithms (which approximate more than the associated undirected graph) were developed more recently [CKP+17], [CGP+18] and have yet to be extended to handle long random walks.</p><p>Here we describe the challenges in sparsifying I − W^ℓ for directed graphs.</p><p>a) Unknown Stationary Distribution.: While the kernel of an undirected Laplacian matrix is the all-ones vector, computing the kernel of a directed Laplacian matrix L corresponds to computing the stationary distribution π of the random walk on the directed graph (LD_out⁻¹π = 0). Without explicitly knowing the kernel, it is not known how to efficiently perform any kind of useful sparsification or approximately solve linear systems in L. This difficulty was overcome in [CKP+16], <ref type="bibr">[AJSS19]</ref>, which provide reductions from solving general directed Laplacian systems to the case where the graph is Eulerian, meaning that every vertex has the same in-degree as out-degree. In Eulerian graphs, the stationary distribution is simply proportional to the vertex degrees, and the all-ones vector spans both the left and right kernels of the associated directed Laplacian.</p><p>b) Defining Approximation.: Undirected Laplacians L are symmetric and positive semidefinite (PSD), i.e., xᵀLx ≥ 0 for all x. This leads to the natural Spielman-Teng <ref type="bibr">[ST04]</ref> definition of multiplicative approximation given in Equation (1). That is, we say that L̃ is an ε-approximation of L if (1 − ε) · L ⪯ L̃ ⪯ (1 + ε) · L,</p><p>where ⪯ is the Löwner order on PSD matrices. However, even though Laplacians of directed graphs are potentially asymmetric, the quadratic form xᵀLx depends only on a symmetrization of the Laplacian (xᵀLx = xᵀ((L + Lᵀ)/2)x). Consequently, the quadratic form discards key information about the associated directed graph (e.g. 
the quadratic form of a directed cycle and an undirected cycle are the same). Thus, defining approximation for directed graphs (even Eulerian ones) is more challenging than for undirected graphs, and a more complex notion of approximation was introduced in [CKP+16]. This additional complexity requires designing new sparsification algorithms that take into account the directedness of the graph. c) Preservation under Powering.: Even for undirected graphs, the standard definition of spectral approximation in Equation (1) is not preserved under powering. That is, I − W̃ ≈ I − W does not imply that I − W̃² ≈ I − W². Indeed, in graphs that are bipartite (and connected), I − W² has a two-dimensional kernel, corresponding to the ±1 eigenvalues of W, whereas I − W has only a one-dimensional kernel. Standard spectral approximation requires perfect preservation of the kernel of I − W, but not of I − W². Graphs that are nearly bipartite (i.e., where W has an eigenvalue near −1) can also experience a large loss in quality of approximation when squaring.</p><p>Cheng et al. [CCL+15] addressed this issue by (implicitly) strengthening spectral approximation to require that</p><p>(1 − ε) · (I − W²) ⪯ I − W̃² ⪯ (1 + ε) · (I − W²). (2)</p><p>This notion of approximation enabled algorithms for sparsifying I − W^ℓ for undirected graphs in randomized near-linear time [CCL+15], <ref type="bibr">[MRSV21]</ref> and deterministic logspace <ref type="bibr">[MRSV21]</ref>, <ref type="bibr">[DMVZ20]</ref>. For directed graphs, the problem comes not just from bipartiteness but from general periodic structures (e.g. a directed cycle), which give W complex eigenvalues on or near the unit circle. This led Ahmadinejad et al. [AKM+20] to propose the notion of unit-circle (UC) approximation, which amounts to requiring that I − zW̃ approximate I − zW for all complex numbers z of magnitude 1, with respect to the standard notion of approximation for directed graphs proposed in [CKP+16]. 
UC approximation has the property that it is preserved under taking arbitrary powers, with no loss in the approximation error. As such, sparsification techniques for UC and stronger notions must exactly preserve periodicity.</p><p>d) Preservation of Periodic Structures.: Sparsifying directed graphs under UC approximation is more challenging due to the need to preserve periodic structures in the graph, which can be easily lost or introduced by common sparsification techniques such as random sampling <ref type="bibr">[SS08]</ref> or patching to fix degrees. Thus in [AKM+20], it was only shown how to partially sparsify the square of a graph; that is, to obtain a graph with random-walk matrix W̃₂ such that I − W̃₂ UC-approximates I − W² but has fewer edges than the true square W². Still, the number of edges in W̃₂ is larger than in W by at least a constant factor, so if we iterate to obtain a sparse approximation of W^ℓ, the number of edges will grow by a factor of c^{log ℓ} = poly(ℓ) and our approximations will quickly become dense. This was affordable in the deterministic logspace algorithms of [AKM+20], but is not in our setting of nearly linear time.</p><p>Our Work: In this paper we provide several tools for overcoming these challenges, advancing both algorithmic and structural tools for graph sparsification. First, we introduce a new notion of directed graph approximation called singular value (SV) approximation. We then show that this notion of approximation strictly strengthens unit-circle approximation and show that it has a number of desirable properties, such as preservation under not only powers but arbitrary products of random-walk matrices, and implying approximation of stationary probabilities of all cuts. 
Then we provide an efficient nearly linear-time randomized algorithm for computing nearly linear-sized SV-sparsifiers for arbitrary Eulerian directed graphs; this implies the first proof that nearly linear-sized UC-sparsifiers exist for Eulerian directed graphs. As a starting point for this result, we provide a simple reduction from SV-sparsifying Eulerian directed graphs to SV-sparsifying undirected graphs; no such reduction was known for the previous, weaker forms of spectral approximation of directed graphs, and its existence shows that SV approximation is a significant strengthening even for undirected graphs.</p><p>Combined with the Eulerian scaling algorithms of [CKK+18], we obtain an algorithm for approximating the stationary probabilities of cuts (as well as "uncuts") in random walks on arbitrary directed graphs, which we define as follows:</p><p>Definition I.1 (Cut values). For a strongly connected, weighted digraph G on n vertices, let µ be the unique stationary distribution of the random walk on G, and let µ_edge be the stationary distribution on edges (i, j) of G (i.e., pick i according to µ and j by following a random edge out of i with probability proportional to its weight). For subsets S and T of vertices, define Cut_G(S, T) := Pr_{(i,j)←µ_edge}[i ∈ S and j ∈ T] and Uncut_G(S, T) := 1 − Cut_G(S, T).</p><p>If G has random-walk matrix W, we may write Cut_W and Uncut_W instead of Cut_G and Uncut_G. Definition I.2 (Powering). 
For a weighted digraph G with adjacency matrix A, out-degree matrix D_out, and random-walk matrix W = AD_out⁻¹, we write G^ℓ for the weighted digraph with adjacency matrix (AD_out⁻¹)^ℓ · D_out (and thus out-degree matrix D_out and random-walk matrix W^ℓ).</p><p>With these definitions, the main application of our SV sparsification results is the following:</p><p>Theorem I.3. There is a randomized algorithm that, given a strongly connected n-node m-edge directed graph G with integer edge weights in [1, U], a walk length ℓ, an error parameter ε > 0, and a lower bound s on the minimum stationary probability of the random walk on G, runs in time O((m + nε⁻²) · poly(log(Uℓ/s))) and outputs an O(nε⁻² · poly(log(Uℓ/s)))-edge graph H such that for every two sets S, T of vertices, we have Cut_H(S, T) = (1 ± ε) · Cut_{G^ℓ}(S, T). In particular, Cut_H(S) = (1 ± ε) · Cut_{G^ℓ}(S) and Uncut_H(S) = (1 ± ε) · Uncut_{G^ℓ}(S).</p><p>Note that when U, ℓ ≤ poly(n), s ≥ 1/poly(n), and ε ≥ 1/poly(log n), our algorithm runs in time Õ(m). For comparison, note that, given a set S, we can estimate the cut value for S using random walks in time roughly Õ(ℓ/(ε² Cut_G(S))), which is slower when ℓ/Cut_G(S) is m^{1+Ω(1)}. (Note that Cut_G(S) can be as small as 1/m.)</p><p>It is also worth comparing to the following approaches, which yield high-precision estimates (i.e. replacing the multiplicative error ε with polynomially small additive error):</p><p>• Use matrix powering via repeated squaring to compute W^ℓ. This takes time n^ω · log(ℓ), where ω is the matrix multiplication exponent. This is slower than our algorithm assuming ω > 2 or m ≤ n^{2−Ω(1)}.</p><p>• Use the algorithm of [CKK+18] to obtain a high-precision estimate of the stationary distribution µ of G in time Õ(m), and then use repeated matrix-vector multiplication to compute W^ℓµ. This takes time Õ(mℓ), so is slower than our algorithm except when ℓ ≤ polylog(n). 
• The small-space algorithms of [AKM+20] also give high-precision estimates of W^ℓ, and do so in nearly logarithmic space, but the running time is superpolynomial. A running time of Ω(m · ℓ) seems inherent in the approach, as it works by reducing to solving a directed Laplacian system of size m · ℓ.</p><p>It remains an interesting open problem to estimate any desired entry of W^ℓ to high precision in nearly linear time. e) Other Work on SV Approximation.: The definition of SV approximation and some of our results on it (obtained in collaboration between Jack Murtagh and the authors) were first presented in the first author's PhD thesis <ref type="bibr">[Ahm20]</ref> in August 2020. Independently, Kelley <ref type="bibr">[Kel21]</ref> used a variant of SV approximation to present an alternative proof of a result of <ref type="bibr">[HPV21]</ref>, who used unit-circle approximation to prove that the Impagliazzo-Nisan-Wigderson pseudorandom generator <ref type="bibr">[INW94]</ref> fools permutation branching programs. Golowich and Vadhan <ref type="bibr">[GV22]</ref> used SV approximation and some of our results (presented in the aforementioned thesis) to prove new pseudorandomness properties of expander walks against permutation branching programs. Most recently, Chen, Lyu, Tal, and Wu <ref type="bibr">[CLTW22]</ref> have used a form of SV approximation to present alternative proofs of the results of [AKM+20], <ref type="bibr">[PV21]</ref>.</p></div>
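As a concrete illustration of the cut values of Definition I.1 and the powering of Definition I.2, here is a small numpy sketch. It is our own illustrative code, not from the paper: the helper names stationary and cut_value are ours, and we fix the convention that W[j, i] is the probability of stepping from vertex i to vertex j (so W is column-stochastic and Wπ = π).

```python
import numpy as np

def stationary(W):
    """Stationary distribution of a column-stochastic walk matrix W,
    where W[j, i] is the probability of stepping from i to j (so W pi = pi)."""
    vals, vecs = np.linalg.eig(W)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return pi / pi.sum()

def cut_value(W, S, T):
    """Cut_W(S, T) in the sense of Definition I.1: the stationary probability
    that a random edge (i, j) of the walk has i in S and j in T."""
    mu = stationary(W)
    return sum(mu[i] * W[j, i] for i in S for j in T)

# Directed 4-cycle: the walk deterministically steps i -> i+1 (mod 4).
n = 4
W = np.zeros((n, n))
for i in range(n):
    W[(i + 1) % n, i] = 1.0

# Per Definition I.2, the random-walk matrix of G^l is simply W^l.
Wl = np.linalg.matrix_power(W, 2)

S, T = {0, 1}, {2, 3}
print(cut_value(W, S, T))   # ~0.25: only the edge 1 -> 2 crosses from S to T
print(cut_value(Wl, S, T))  # ~0.5: the 2-step edges 0 -> 2 and 1 -> 3 both cross
```

The 2-step walk has twice the cut mass of the 1-step walk here, which is exactly the kind of quantity Theorem I.3 estimates without forming W^ℓ explicitly.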
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Singular-Value Approximation</head><p>In this paper, we present a stronger and more robust notion for addressing the challenge of defining approximation between directed graphs. Specifically, we introduce a novel definition of approximation for asymmetric matrices, which we call singular-value approximation (or SV approximation).</p><p>For simplicity in the rest of this introduction, we focus on the case of regular directed graphs, i.e. directed graphs where for some value d > 0, every vertex has in-degree d and out-degree d. (In the case of digraphs with non-negative edge weights, we obtain the in- and out-degrees by summing the incoming or outgoing edge weights at each vertex.) However, all of our results generalize to Eulerian digraphs and some generalize to wider classes of complex matrices.</p><p>To introduce SV approximation, let A be the adjacency matrix of a d-regular digraph, i.e., A is a non-negative real matrix where every row and column sum equals d. Then the (in- and out-) degree matrix is simply dI. Dividing by d, it is equivalent to study approximation of the random-walk matrix W = A/d, which is doubly stochastic and has degree matrix I.</p><p>Definition I.4 (SV approximation for doubly stochastic matrices). For doubly stochastic matrices W, W̃ ∈ ℝ^{n×n} we say that W̃ is an ε-singular-value (SV) approximation of W, written W̃ ≈_ε^{sv} W, if for all "test vectors" x, y ∈ ℝⁿ, we have</p><p>|xᵀ(W̃ − W)y| ≤ (ε/2) · (xᵀ(I − WWᵀ)x + yᵀ(I − WᵀW)y). (3)</p><p>This formulation of SV-approximation is one of several equivalent formulations we provide in the full paper. We can equivalently define SV approximation between doubly stochastic matrices by requiring Equation (3) to hold for all complex test vectors x, y ∈ ℂⁿ. SV-approximation can also be defined equivalently by replacing condition (3) with</p><p>|xᵀ(W̃ − W)y| ≤ ε · √(xᵀ(I − WWᵀ)x · yᵀ(I − WᵀW)y). (5)</p><p>These two formulations, (3) and (5), differ only in using the arithmetic mean or the geometric mean of the terms involving x and y on the right-hand side. 
The formulation in terms of the geometric mean implies the one in terms of the arithmetic mean (since the geometric mean is no larger than the arithmetic mean); the converse follows by optimizing over scalar multiples of x and y (as was done in e.g. [CKP+17], [AKM+20]). Both formulations can be rewritten more simply by noting that xᵀ(I − WWᵀ)x = ‖x‖² − ‖xᵀW‖² and yᵀ(I − WᵀW)y = ‖y‖² − ‖Wy‖², but the description in terms of quadratic forms will be more convenient for comparison with previous notions of approximation.</p><p>In the full paper, we provide more general definitions of SV approximation which also apply to unnormalized directed Laplacians and even to complex matrices. We prove that SV approximation is strictly stronger than previous notions of spectral approximation considered in the literature, even for undirected graphs, and enjoys several useful properties not possessed by the previous notions. Most notably, there is a simple black-box reduction from SV-sparsifying Eulerian directed graphs to SV-sparsifying undirected graphs; no such reduction is known for prior notions of asymmetric spectral approximation.</p><p>Furthermore, we give efficient algorithms for working with SV approximation. These include nearly linear-time algorithms for SV-sparsifying undirected and hence also Eulerian directed graphs (Theorem I.11), as well as random-walk polynomials of directed graphs (Theorem I.11). We also show that a simple repeated-squaring and sparsification algorithm for solving Laplacian systems works for Eulerian digraphs whose random-walk matrix is normal (i.e., unitarily diagonalizable), if we use SV-sparsification at each step (Theorem I.12). Prior Laplacian solvers for Eulerian graphs are more complex. We elaborate on these results in the next several subsections.</p></div>
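Definition I.4 can also be checked numerically. Under the geometric-mean formulation (5), the smallest valid ε is a weighted spectral norm of W̃ − W, provided W̃ − W vanishes on the kernels of I − WWᵀ and I − WᵀW. The following numpy sketch is our own illustration (the helpers pinv_sqrt and sv_error are not from the paper, and the perturbed matrix below may have tiny negative entries; it only serves to exercise the formula):

```python
import numpy as np

def pinv_sqrt(P, tol=1e-12):
    """Pseudo-inverse square root of a symmetric PSD matrix P."""
    w, V = np.linalg.eigh(P)
    s = np.where(w > tol, 1.0 / np.sqrt(np.clip(w, tol, None)), 0.0)
    return V @ np.diag(s) @ V.T

def sv_error(W_tilde, W):
    """Smallest eps with |x^T (W~-W) y| <= eps * sqrt(x^T(I-WW^T)x * y^T(I-W^TW)y)
    for all x, y, assuming W~-W vanishes on the kernels of the two PSD forms."""
    n = W.shape[0]
    M = W_tilde - W
    P = pinv_sqrt(np.eye(n) - W @ W.T)
    Q = pinv_sqrt(np.eye(n) - W.T @ W)
    return np.linalg.norm(P @ M @ Q, 2)  # spectral norm

n = 8
J = np.full((n, n), 1.0 / n)        # complete graph with self-loops
rng = np.random.default_rng(0)
E = rng.standard_normal((n, n))
E -= E.mean(axis=0)                 # columns sum to 0 ...
E -= E.mean(axis=1, keepdims=True)  # ... and rows sum to 0, so J + c*E
W_tilde = J + 0.01 * E              # keeps the uniform distribution stationary
print(sv_error(J, J))               # 0: J SV-approximates itself perfectly
print(sv_error(W_tilde, J))         # small but nonzero
```

Because the perturbation has zero row and column sums, it vanishes on the all-ones kernel of I − JJᵀ = I − J, so the kernel assumption above holds for this example.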
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Comparison to Previous Notions of Approximation</head><p>Let us compare Definition I.4 to previous definitions of approximation.</p><p>a) Undirected spectral approximation.: Let's start with the undirected case, where W = Wᵀ. In this case, it can be shown that we can without loss of generality restrict Definition I.4 to x = y, obtaining the following: for all x ∈ ℝⁿ,</p><p>|xᵀ(W̃ − W)x| ≤ (ε/2) · xᵀ(I − W²)x. (6)</p><p>In contrast, the standard definition of spectral approximation (introduced by Spielman and Teng <ref type="bibr">[ST04]</ref>), which we denote by W̃ ≈_ε W, is equivalent to requiring that for all x ∈ ℝⁿ, we have</p><p>|xᵀ(W̃ − W)x| ≤ ε · xᵀ(I − W)x. (7)</p><p>To compare inequalities (6) and (7), we write x = Σᵢ cᵢvᵢ, where v₁, …, vₙ is an orthonormal eigenbasis for W with associated eigenvalues λ₁, …, λₙ. Since W is stochastic, the right-hand side of SV inequality (6) becomes (ε/2) · Σᵢ (1 − λᵢ²)cᵢ², whereas the right-hand side of ST inequality (7) becomes ε · Σᵢ (1 − λᵢ)cᵢ². Since each |λᵢ| ≤ 1, the fact that SV approximation implies ST approximation then follows from</p><p>1 − λᵢ² ≤ 2 · (1 − λᵢ).</p><p>However, we also see that inequality (6) can be much stronger than inequality (7) when W has eigenvalues λᵢ close to −1 (e.g. in a bipartite graph with poor expansion), because then 1 − λᵢ² is close to 0, but 1 − λᵢ is close to 2. More generally, inequality (6) requires that W̃ approximates W very well on every test vector x that is concentrated on the eigenvectors whose eigenvalues have magnitude close to 1, whereas inequality (7) only requires close approximation on the (signed) eigenvalues that are close to 1.</p><p>We remark that another way of ensuring that W̃ preserves unit singular values is to replace W² in the SV inequality (6) with the matrix |W|, where we replace all eigenvalues of W with their absolute value rather than their square,¹ so that we have:</p><p>|xᵀ(W̃ − W)x| ≤ (ε/2) · xᵀ(I − |W|)x.</p><p>Using |W| instead of W² results in an equivalent definition up to a factor of 2 in ε, and W² turns out to be convenient to work with. 
² This viewpoint also explains why we stop at W² in the definition and don't explicitly use higher powers; it is simply a convenient proxy for |W|, which captures all powers. Indeed, for all k ∈ ℕ we have I − W^{2k} ⪰ I − |W|.</p><p>b) Directed spectral approximation.: Turning to previous notions of spectral approximation for directed graphs, standard approximation [CKP+17], written W̃ ≈_ε W, generalizes the definition of Spielman and Teng <ref type="bibr">[ST04]</ref> by requiring that for all x, y ∈ ℝⁿ,</p><p>|xᵀ(W̃ − W)y| ≤ (ε/2) · (xᵀ(I − W)x + yᵀ(I − W)y). (8)</p><p>¹ Another way of describing |W| is as the PSD square root of the PSD matrix W². ² |W| = (W²)^{1/2}, i.e., |W| is the PSD square root of W². In Definition I.4, we could similarly replace WWᵀ and WᵀW with their PSD square roots and obtain a definition that is equivalent up to a factor of 2 in ε.</p><p>Equivalently, we can require that for all x, y ∈ ℝⁿ,</p><p>|xᵀ(W̃ − W)y| ≤ ε · √(xᵀ(I − W)x · yᵀ(I − W)y). (9)</p><p>It can be shown that for undirected graphs, this condition is equivalent to that of Equation <ref type="formula">7</ref>, so we will also refer to it as standard approximation. The use of different left and right test vectors x and y on the left-hand side is crucial for capturing the asymmetric information in W̃ and W. As before, if x or y is concentrated on eigenvectors of W whose eigenvalues are close to 1, then the right-hand side of ST inequality (9) is close to 0 and W̃ must approximate W very well. However, like the standard undirected ST inequality (7), not much is required on eigenvalues near −1. Moreover, asymmetric matrices can have eigenvalues that are not real and are equal to or close to complex numbers of magnitude 1. For example, the eigenvalues of a directed n-cycle are the complex n'th roots of unity.</p><p>To address this issue, unit-circle (UC) approximation [AKM+20], written W̃ ≈_ε^{uc} W, requires that for all complex numbers z of magnitude 1 and all complex test vectors x, y ∈ ℂⁿ, we have</p><p>|x*(W̃ − W)y| ≤ (ε/2) · (|x*(I − zW)x| + |y*(I − zW)y|). (10)</p><p>That is, we take the complex magnitude of the terms involving W on the right-hand side of ST inequality (8). 
That way, if x and y are concentrated on eigenvectors of W that have eigenvalue near some complex number µ of magnitude 1, we require (taking z to be the complex conjugate of µ) that W̃ approximates W very well. For example, consider the case where W is normal, i.e., has an orthonormal basis of complex eigenvectors v₁, …, vₙ with corresponding complex eigenvalues λ₁, …, λₙ. Then if we write x = Σᵢ cᵢvᵢ and y = Σᵢ dᵢvᵢ, the right-hand side of UC inequality (10) becomes:</p><p>(ε/2) · (|Σᵢ (1 − zλᵢ)|cᵢ|²| + |Σᵢ (1 − zλᵢ)|dᵢ|²|). (11)</p><p>If x and y are concentrated on eigenvalues λᵢ ≈ µ where |µ| = 1, then this expression will be close to 0 (taking z to be the complex conjugate of µ). Unit-circle approximation has valuable properties not enjoyed by standard approximation, in particular being preserved under powering: if W̃ is an ε-UC approximation of W, then for every positive integer k, W̃^k is an O(ε)-UC approximation of W^k; note that the quality of approximation does not degrade with k. This property was crucial for the results of [AKM+20].</p><p>However, UC approximation has two limitations compared to SV approximation. First, UC expression (11) is only small if x and/or y is concentrated on eigenvalues that are all close to the same point µ on the complex unit circle. Even in the undirected case, if x and y are mixtures of eigenvectors with eigenvalue close to 1 and eigenvalue close to −1, then there will be cancellations in the second term of UC expression (11) and the result will not be small. Second, some properties of asymmetric matrices are more directly captured by singular values than eigenvalues, since singular values treat the domain and codomain as distinct. 
For example, the second-largest singular value of W equals 1 if and only if there is a probability distribution π on vertices that does not mix at all in one step (i.e., ‖Wπ − u‖ = ‖π − u‖, where u = (1/n, …, 1/n) is the uniform distribution and π ≠ u), but the latter can hold even when all nontrivial eigenvalues have magnitude strictly smaller than 1.</p><p>To see how SV approximation addresses these limitations, let σ₁, …, σₙ ≥ 0 be the singular values of W, let u₁, u₂, …, uₙ ∈ ℂⁿ be the corresponding left-singular vectors of W, and let v₁, …, vₙ ∈ ℂⁿ be the corresponding right-singular vectors. If we write x = Σᵢ cᵢuᵢ and y = Σᵢ dᵢvᵢ, then the right-hand side of SV inequality (3) becomes:</p><p>(ε/2) · (Σᵢ (1 − σᵢ²)|cᵢ|² + Σᵢ (1 − σᵢ²)|dᵢ|²). (12)</p><p>Consequently, SV-approximation requires high-quality approximation if x is concentrated on left-singular vectors of singular value close to 1 and/or y is concentrated on right-singular vectors of singular value close to 1. (For the "or" interpretation, use the formulation of SV approximation in terms of inequality (5).) To compare with UC expression (11), let us consider what happens with a normal matrix, where the singular values are σᵢ = |λᵢ| and the right-singular vectors can be taken to be the eigenvectors vᵢ (with uᵢ = (λᵢ/|λᵢ|) · vᵢ when λᵢ ≠ 0). In this case, SV expression (12) amounts to bringing the absolute value of UC expression (11) inside the summation (and squaring, which only makes a factor of 2 difference), to avoid cancellations between eigenvalues of different phases. Furthermore, for non-normal matrices, SV approximation retains the asymmetry of W even on the right-hand side, by always using x on the left of W (thus relating to its decomposition into left singular vectors) and y on the right of W (thus relating to its decomposition into right singular vectors). Indeed, this feature allows us to even extend the definition of SV approximation to non-square matrices.</p><p>Following the above intuitions, we prove that SV approximation is indeed strictly stronger than the previous notions of approximation, even for undirected graphs: Theorem I.5. 
For all doubly stochastic matrices W and W̃, if W̃ ≈_ε^{sv} W then W̃ ≈_{O(ε)}^{uc} W.</p><p>On the other hand, for every n ∈ ℕ there exist random-walk matrices W̃, W for n-node undirected graphs such that W̃ is an o(1)-UC approximation of W but not an o(1)-SV approximation of W.</p><p>Since UC approximation implies standard approximation, we likewise separate SV from standard approximation. Finally, we note that our separation implies that several useful properties enjoyed by SV approximation, such as preservation under products, are not satisfied by UC approximation.</p></div>
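The gap between the budgets 1 − λ and 1 − λ² in inequalities (6) and (7) is easy to see numerically. Below is a minimal sketch of our own (assuming numpy) on the undirected 16-cycle, which is bipartite and hence has −1 as an eigenvalue of its random-walk matrix:

```python
import numpy as np

# Random-walk matrix of the undirected n-cycle; for even n the graph is
# bipartite, so -1 is an exact eigenvalue of W.
n = 16
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[i, (i - 1) % n] = 0.5

lams = np.sort(np.linalg.eigvalsh(W))
lam = lams[0]      # the eigenvalue -1
print(1 - lam)     # ST budget from (7): ~2, so large errors are tolerated here
print(1 - lam**2)  # SV budget from (6): ~0, so the eigenvalue must be preserved
```

On this graph an ST sparsifier may move the eigenvalue at −1 freely, while inequality (6) leaves essentially no slack there, matching the discussion of bipartite periodicity above.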
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Properties of SV Approximation</head><p>SV approximation enjoys a number of novel properties not known to be possessed by previous notions of spectral approximation. Most striking is the fact that directed approximation reduces to undirected approximation. To formulate this, we define the symmetric lift of a matrix: Definition I.6. Given W ∈ ℂ^{m×n}, let the symmetric lift of W be defined as the (m+n) × (m+n) matrix</p><p>slift(W) = [ 0 W ; W* 0 ].</p><p>Graph theoretically, the symmetric lift of W is the following standard operation: given our directed graph G on n vertices with random-walk matrix W, we lift G to an undirected bipartite graph H with n vertices on each side, where we connect left-vertex i to right-vertex j if there is a directed edge from i to j in G. Then slift(W) is the random-walk matrix of H.</p><p>Theorem I.7. Let W and W̃ be doubly stochastic matrices. Then W̃ ≈_ε^{sv} W if and only if slift(W̃) ≈_ε^{sv} slift(W).</p><p>Thus, for the first time (as far as we know), sparsification of directed graphs reduces directly to sparsification of undirected graphs. It would be very interesting to obtain a similar reduction for other algorithmic problems in spectral graph theory, such as solving Laplacian systems.</p><p>Another novel property of SV approximation is that it is preserved under products: Theorem I.8. If W₁, …, Wₖ and W̃₁, …, W̃ₖ are doubly stochastic matrices with W̃ᵢ ≈_ε^{sv} Wᵢ for each i, then W̃₁W̃₂⋯W̃ₖ ≈_{O(ε)}^{sv} W₁W₂⋯Wₖ.</p><p>Notably the approximation error does not grow with the number k of matrices being multiplied. This property does not hold for UC approximation, which enjoys only the weaker property of preservation under powering, i.e., W̃ ≈_ε^{uc} W implies W̃^k ≈_{O(ε)}^{uc} W^k.</p><p>In addition, SV approximation is preserved under multiplication on the left and right by arbitrary matrices of bounded spectral norm. Indeed, it can be seen as the "closure" of standard approximation under this operation (up to a factor of 2).</p><p>Theorem I.9. The following hold for all doubly stochastic matrices W and W̃:</p><p>1) If W̃ ≈_ε^{sv} W, then for all complex matrices U and V of spectral norm at most 1, we have UW̃V ≈_ε^{sv} UWV, and hence UW̃V ≈_ε UWV. 
2) If for all complex matrices U and V of spectral norm at most 1 we have UW̃V ≈_ε UWV, then W̃ ≈_{2ε}^{sv} W.</p><p>Since UWV and UW̃V need not be doubly stochastic matrices, Theorem I.9 uses the generalization of SV approximation to more general matrices, which can be found in the full version of the paper.</p><p>Recall that standard spectral sparsifiers <ref type="bibr">[ST04]</ref> are also cut sparsifiers <ref type="bibr">[BK00]</ref>. That is, if G̃ is an ε-approximation of G, then for every set S of vertices, the weight of the cut S in G̃ is within a (1 ± ε) factor of the weight of S in G. Indeed, if we take the test vector x to be the characteristic vector of the set S in inequality (7), we obtain</p><p>|Cut_W̃(S) − Cut_W(S)| ≤ ε · Cut_W(S), (13)</p><p>where Cut(·) is as in Definition I.1. Similarly, we can obtain a combinatorial consequence of SV approximation, by taking x to be a characteristic vector of a set S of vertices and taking y to be a characteristic vector of a set T of vertices. This yields: Proposition I.10. Let W̃ and W be doubly stochastic n × n matrices and suppose that W̃ ≈_ε^{sv} W. Then for every two subsets S, T ⊆ [n], we have</p><p>|Cut_W̃(S, T) − Cut_W(S, T)| ≤ (ε/2) · (Cut_{WWᵀ}(S) + Cut_{WᵀW}(T)).</p><p>Note that WᵀW (resp., WWᵀ) is the transition matrix for the forward-backward walk (resp. backward-forward walk), namely where we take one step using a forward edge of the graph followed by one step using a backward edge.</p><p>Let us interpret Proposition I.10. First, consider the case that W = J, the matrix with every entry equal to 1/n (the random-walk matrix for the complete graph with self-loops). Then the distribution µ_edge on pairs (i, j) in the definition of Cut_W (Definition I.1) has i and j as uniform and independent vertices, and the same is true for Cut_{WWᵀ} and Cut_{WᵀW}. Thus, Proposition I.10 says:</p><p>|Cut_W̃(S, T) − µ(S) · µ(T)| ≤ (ε/2) · (µ(S)µ(Sᶜ) + µ(T)µ(Tᶜ)),</p><p>where µ(S) = |S|/n and µ(T) = |T|/n are the stationary probabilities of S and T, respectively. 
This amounts to a restatement of the Expander Mixing Lemma (cf., <ref type="bibr">[Vad12,</ref><ref type="bibr">Lemma 4</ref>.15]); indeed, W̃ ≈^sv_ε J if and only if W̃ is a spectral expander with all nontrivial singular values at most ε/2.</p><p>Next, let us consider the case that T = S^c. Since Cut_{WW^⊤}(S) = Cut_{WW^⊤}(S^c), SV approximation implies that: |Cut_W̃(S) − Cut_W(S)| ≤ (ε/2) · (Cut_{WW^⊤}(S) + Cut_{W^⊤W}(S^c)). (14)</p><p>We claim that (<ref type="formula">14</ref>) is stronger than the standard notion of a cut approximator (13). Indeed, it can be shown that Cut_{WW^⊤}(S) ≤ 2 · Cut_W(S), and similarly for Cut_{W^⊤W}(T). The reason is that if a backward-forward walk crosses between S and S^c, then it must cross between S and S^c in either the first step or the second step. Similar reasoning shows that Cut_{WW^⊤}(S) ≤ 2 · Uncut_W(S), and similarly for Cut_{W^⊤W}(T). Thus SV approximation also implies: |Cut_W̃(S) − Cut_W(S)| ≤ 2ε · min{Cut_W(S), Uncut_W(S)}.</p><p>Thus, we conclude that an SV-approximator not only approximates every cut to within a small additive error that is scaled by the weight of the cut edges (as in (13)), but also scaled by the weight of the uncut edges.</p></div>
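The directed-to-undirected reduction above can be checked numerically. The following sketch (plain NumPy; the helper name `slift` is ours, following Definition I.6) builds the symmetric lift of a doubly stochastic walk matrix and confirms that the eigenvalues of the lift are, up to sign, the singular values of W, so spectral statements about the undirected lift translate into singular-value statements about the directed graph.

```python
import numpy as np

def slift(W):
    """Symmetric lift of a walk matrix W (Definition I.6): the
    random-walk matrix [[0, W], [W^T, 0]] of the undirected bipartite
    lift, joining left-vertex i to right-vertex j whenever the
    directed graph has an edge from i to j."""
    n = W.shape[0]
    Z = np.zeros((n, n))
    return np.block([[Z, W], [W.T, Z]])

# Walk matrix of a lazy directed 3-cycle (doubly stochastic).
P = np.roll(np.eye(3), 1, axis=1)   # permutation matrix of the 3-cycle
W = 0.5 * np.eye(3) + 0.5 * P

H = slift(W)
# H is the walk matrix of an undirected bipartite graph: symmetric
# and doubly stochastic.
assert np.allclose(H, H.T)
assert np.allclose(H.sum(axis=1), 1)

# Eigenvalues of slift(W) come in pairs +/- sigma_i(W), so taking the
# absolute eigenvalues of H once per pair recovers the singular
# values of W.
sigma = np.linalg.svd(W, compute_uv=False)
abs_eigs = np.sort(np.abs(np.linalg.eigvalsh(H)))[::-1]
assert np.allclose(abs_eigs[::2], sigma)
```

In particular, all nontrivial singular values of W appear as nontrivial eigenvalues of slift(W), which is what lets undirected sparsification of H certify SV approximation of W.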
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Algorithmic Results</head><p>Even though SV approximation is stronger than previously considered notions of spectral approximation, we show that it still admits sparsification: Theorem I.11. There is a randomized nearly-linear-time algorithm that, given a regular directed graph G with n vertices, m edges, integer edge weights in [0, U], random-walk matrix W, and ε &gt; 0, whp outputs a weighted graph G̃ with at most O(n ε^{−2} · poly(log(nU))) edges such that its random-walk matrix W̃ satisfies W̃ ≈^sv_ε W.</p><p>A more general theorem that also applies to Eulerian digraphs is stated in the full version of the paper. Prior to this work, it was open whether UC-sparsifiers with O(n · poly(log n, 1/ε)) edges even existed for all unweighted regular digraphs. It was only known how to UC-sparsify powers of a random-walk matrix in such a way that the number of edges increases by at most a polylogarithmic factor compared to the original graph (rather than decreasing the number of edges) [AKM + 20].</p><p>By Theorem I.7, it suffices to prove Theorem I.11 for undirected bipartite graphs. We obtain the latter via an undirected sparsification algorithm based on expander partitioning <ref type="bibr">[ST04]</ref>. It remains an open question whether algorithms based on edge sampling can yield SV or unit-circle approximation, even for undirected graphs. The standard approach to spectral sparsification of undirected graphs via sampling, namely keeping each edge independently with probability proportional to its effective resistance <ref type="bibr">[SS08]</ref>, does not work for SV or UC approximation. For example, this method does not exactly preserve degrees, which we show is necessary for SV sparsification. 
<ref type="foot">3</ref> However, we remark that the work of Chu, Gao, Peng, Sawlani, and Wang [CGP + 18] does yield something closer to sparsification via degree-preserving sampling for standard approximation [CKP + 17], though not for unit-circle approximation. They show that if one decomposes a directed graph into short "cycles" without regard for the direction of the edges on each cycle, then one can sparsify by randomly eliminating either the clockwise or the counterclockwise edges on each such cycle. We build on their procedure to obtain SV (and hence UC) sparsification, showing that this technique achieves SV approximation even if the cycles are not short, as long as (a) all the cycles lie within expanding subgraphs, and (b) the cycles alternate between forward and backward edges. (Note that such alternating cycles in a directed graph correspond to ordinary cycles in the undirected lift given by Theorem I.7.) Given Theorem I.11, we obtain our algorithm for longer walks (Theorem I.3) as follows:</p><p>1) First, we show that we can SV-sparsify the squares of random-walk matrices of Eulerian digraphs; we follow the approach of [CKK + 18] by locally sparsifying the complete bipartite graphs that form around each vertex when squaring, and then applying Theorem I.11 to sparsify further globally. We likewise show that the "derandomized square" approach used in <ref type="bibr">[RV05]</ref>, <ref type="bibr">[PS14]</ref>, <ref type="bibr">[MRSV21]</ref>, [AKM + 20] gives a square sparsifier. 2) Then we SV-sparsify arbitrary powers of two by repeatedly squaring and sparsifying, using the fact that SV approximation is preserved under powering. During this process, we need to ensure that the ratio between the largest and smallest edge weights remains bounded. 
We do this by restricting to graphs whose second-largest singular value is bounded away from 1 by 1/poly(nUℓ), which allows us to discard edge weights that get too small and make small patches to preserve degrees. We can achieve this assumption on the second-largest singular value by adding a small amount of laziness to our initial graph. 3) Then, to sparsify an arbitrary power W^ℓ, we multiply sparsifiers for the powers of two appearing in the binary representation of ℓ. For example, to get a sparsifier for W^7, we multiply sparsifiers for W^4, W^2, and W^1, sparsifying and eliminating small edge weights again after each product. The use of SV approximation plays an important role in the analysis of this algorithm, because it has the property that the product of the approximations of the powers still approximates the product of the true powers (Theorem I.8). 4) Given Theorem I.11, we obtain Theorem I.3 for general directed graphs by using [CKK + 18] to compute a high-precision estimate of the stationary distribution, which allows us to construct an Eulerian graph whose random-walk matrix closely approximates that of the original graph. SV-sparsifying the ℓ-th power of the Eulerian graph gives us a graph all of whose Cut and Uncut values approximate those of the ℓ-th power of our input graph. The use of [CKK + 18] to estimate the stationary distribution and the introduction of laziness to W both incur a small additive error δ, but we can absorb that into ε by setting δ = 1/poly(nUℓ) and observing that Cut_{G^ℓ}(S) and Uncut_{G^ℓ}(S) are at least 1/poly(nUℓ) (if nonzero).</p><p>Our final contribution concerns algorithms for solving directed Laplacian systems. The recursive identities used for solving undirected Laplacian systems, while behaving nicely with respect to PSD approximation, do not behave as nicely with respect to the previous approximation definitions for directed graphs. 
This led to different, more sophisticated recursions with a more involved analysis of the error [CKP + 17], [CKK + 18], [AKM + 20], <ref type="bibr">[KMG22]</ref>. We make progress towards simplifying the recursion and analysis for solving directed Laplacian linear systems in the following way. We show that a simpler recursion, a variant of the one used by Peng and Spielman <ref type="bibr">[PS14]</ref> (Equation (17) below), and a simpler analysis suffice if the directed Laplacian is normal (i.e., unitarily diagonalizable) and we perform all sparsification with respect to SV approximation. Note that this result is the only result in our paper that relies on a normality assumption; the aforementioned sparsification results hold for all Eulerian directed graphs.</p><p>Theorem I.12. For a doubly stochastic normal matrix W ∈ ℝ^{n×n} with ‖W‖ ≤ 1, let W = W_0, …, W_{k−1} be a sequence of matrices such that for ε ≤ 1/4k we have W_{i+1} ≈^sv_ε W_i^2 for each i, (16)</p><p>and define the P_i by the recurrence P_i = (1/2)(I + (I + W_i) P_{i+1} (I + W_i)); (17) then P_0 is a good preconditioner for I − W, with the error measured with respect to B = ((I − W)^{1/2})^∗ (I − W)^{1/2}. Theorem I.12 says that we can compute a good preconditioner P_0 for the Laplacian I − W by repeatedly computing SV-approximate squares (16) and using the simple recurrence (17), starting with a preconditioner P_k for a sufficiently large power of W. Generally, P_k is easy to obtain for large enough k = O(log n), since W^{2^k} is well approximated by a complete graph (assuming the original graph is connected and aperiodic).</p></div>
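Steps 2 and 3 of the outline above follow the classic square-and-multiply pattern; the role of SV approximation is that each square and each product may be replaced by a sparsifier without the error compounding (Theorems I.8 and I.11). The sketch below shows the exact, unsparsified skeleton of that pipeline in NumPy; the function name is ours, and the sparsification steps appear only as comments.

```python
import numpy as np

def walk_power(W, ell):
    """Compute W^ell by repeated squaring, multiplying in W^(2^i)
    whenever bit i of ell is set -- the skeleton of the sparsification
    pipeline, with exact squares and products standing in for their
    SV-sparsified versions."""
    result = np.eye(W.shape[0])
    square = W.copy()
    while ell > 0:
        if ell & 1:
            result = result @ square   # pipeline: SV-sparsify the product here
        square = square @ square       # pipeline: SV-sparsify the square here
        ell >>= 1
    return result

# Doubly stochastic walk matrix of a small Eulerian digraph.
W = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])

# W^7 assembled from W^4 @ W^2 @ W^1, matching the binary expansion 7 = 111_2.
W7 = walk_power(W, 7)
assert np.allclose(W7, np.linalg.matrix_power(W, 7))
```

Only O(log ℓ) squares and products are needed, which is why a per-stage sparsification error that does not grow under products keeps the total error small.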
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. Open Problems</head><p>One open problem is to determine whether or not it is possible to obtain linear-sized sparsifiers. Recall that undirected graphs have sparsifiers with respect to standard spectral approximation with only O(n/ε^2) nonzero edge weights <ref type="bibr">[BSS12]</ref>. If this result could be extended to obtain linear-sized SV-sparsifiers of undirected graphs, we would also have linear-sized sparsifiers for directed graphs by Theorem I.7, which would be a new result even for standard approximation [CKP + 17].</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0"><p>Unlike standard spectral approximation, degrees cannot be fixed simply by adding self-loops; indeed, self-loops ruin bipartiteness and periodicity, which are properties that UC and SV approximation retain (as they are captured by eigenvalues such as −1 or other roots of unity).</p></note>
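The phenomenon in footnote 3 can be seen in a two-line computation. The sketch below (NumPy; the example graph is ours, not from the paper) shows that the walk matrix of a bipartite graph, here an undirected 4-cycle, has eigenvalue −1, and that adding self-loops (laziness) eliminates it; this is why degree repair via self-loops, harmless for standard approximation, is unavailable for UC and SV approximation.

```python
import numpy as np

# Random-walk matrix of the undirected 4-cycle, a bipartite graph.
W = 0.5 * np.array([[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]], dtype=float)

eigs = np.sort(np.linalg.eigvalsh(W))
# Bipartiteness is captured by the eigenvalue -1.
assert np.isclose(eigs[0], -1.0)

# A lazy walk (self-loops of weight 1/2) shifts the spectrum into
# [0, 1]: the eigenvalue -1, and with it bipartiteness, is gone.
lazy = 0.5 * np.eye(4) + 0.5 * W
lazy_eigs = np.sort(np.linalg.eigvalsh(lazy))
assert lazy_eigs[0] >= -1e-9
```

Since UC and SV approximation track eigenvalues on the whole unit circle, the lazy graph cannot approximate the original one in these senses, no matter how small the cut changes are.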
		</body>
		</text>
</TEI>
