<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Efficient solution of large-scale algebraic Riccati equations associated with index-2 DAEs via the inexact low-rank Newton-ADI method</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>06/01/2020</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10292953</idno>
					<idno type="doi">10.1016/j.apnum.2019.11.016</idno>
					<title level='j'>Applied Numerical Mathematics</title>
<idno>0168-9274</idno>
<biblScope unit="volume">152</biblScope>
<biblScope unit="issue">C</biblScope>					

					<author>Peter Benner</author><author>Matthias Heinkenschloss</author><author>Jens Saak</author><author>Heiko K. Weichelt</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[This paper extends the algorithm of Benner et al. (2016)   [10] to Riccati equations associated with Hessenberg index-2 Differential Algebratic Equation (DAE) systems. Such DAE systems arise, e.g., from semi-discretized, linearized (around steady state) Navier-Stokes equations. The solution of the associated Riccati equation is important, e.g., to compute feedback laws that stabilize the Navier-Stokes equations. Challenges in the numerical solution of the Riccati equation arise from the large-scale of the underlying systems and the algebraic constraint in the DAE system. These challenges are met by a careful extension of the inexact low-rank Newton-ADI method to the case of DAE systems. A main ingredient in the extension to the DAE case is the projection onto the manifold described by the algebraic constraints. In the algorithm, the equations are never explicitly projected, but the projection is only applied as needed. Numerical experience indicates that the algorithmic choices for the control of inexactness and line-search can help avoid subproblems with matrices that are only marginally stable. The performance of the algorithm is illustrated on a large-scale Riccati equation associated with the stabilization of Navier-Stokes flow around a cylinder.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>This paper introduces and analyzes an efficient algorithm for the solution of the generalized continuous algebraic Riccati equation (GCARE) associated with the solution of linear quadratic regulator (LQR) problems governed by Hessenberg index-2 Differential Algebraic Equations (DAEs). This problem arises, e.g., in the computation of feedback laws that stabilize Navier-Stokes flows. The numerical solution of the Riccati equation is challenging because the underlying systems are large-scale and because of the presence of algebraic constraints in the DAE system. To overcome these challenges we extend our inexact low-rank Newton-ADI method developed in <ref type="bibr">[10]</ref> for problems governed by ordinary differential equations (ODEs) to this DAE case. The main idea is to use the structure of the Hessenberg index-2 DAE and apply the discrete version of the Leray projector (see Heinkenschloss et al. <ref type="bibr">[16]</ref> and B&#228;nsch, et al. <ref type="bibr">[4]</ref>) to transform the LQR problem governed by the DAE into a classical LQR problem governed by an ODE. In principle, the standard LQR and Riccati theory as well as the inexact low-rank Newton-ADI method developed in our previous paper <ref type="bibr">[10]</ref> can be applied to this ODE problem. This, however leads to a solution approach that is not practical because the projected systems are large-scale and, because of the projection, dense. To arrive at an efficient algorithm, the computations must be presented in terms of the original large-scale sparse system and the structure of the governing DAE system must be exploited. This is done in this paper. In addition, numerical experience with our new algorithm indicates that our control of inexactness and the line-search leads to a start-up phase that reaches the quadratic convergence region of the Newton iteration faster and tends to avoid marginally stable subproblems during intermediate iterations.</p><p>The LQR problem and associated Riccati equation considered in this paper have also been solved by B&#228;nsch et al. <ref type="bibr">[4]</ref>. However, the focus of <ref type="bibr">[4]</ref> was the computation of feedback laws for Navier-Stokes flows, and a basic version of an inexact low-rank Newton-ADI method was applied. Our paper focusses on the solution of the Riccati equation and incorporates many recent improvements. As a result, the algorithm in this paper delivers an approximately 90-times speed-up over the algorithm used in <ref type="bibr">[4]</ref>. <ref type="bibr">Benner and Stykel [6]</ref> study the solution of projected Riccati equations, which are associated with DAEs. They use so-called spectral projectors, which project onto the right and left deflating subspaces. While these projectors can be applied to general DAEs defined by a regular pencil, in the general case "the projectors [. . . ] are required in explicit form [and the] computation of these projectors is, in general, very expensive" <ref type="bibr">[6, p. 590</ref>]. The projector used in our paper is specially designed for the index-2 DAE system arising for fluid flow problems and our Kleinman-Newton-ADI method contains many improvements not yet available in <ref type="bibr">[6]</ref>. In principle it is possible to use rational Krylov subspace projection methods (see Simoncini et al. <ref type="bibr">[25,</ref><ref type="bibr">26]</ref>) to solve the Riccati equations, but extensions of this approach to the DAE case and numerical comparisons of the latest versions of both approaches are not yet available.</p><p>As pointed out above, a main ingredient for the efficiency of our approach is the exploitation of the special structure of the Hessenberg index-2 DAE, in what is called implicit index-reduction. Specifically, we can use structured projectors, rather than generic and expensive spectral projectors. Implicit index-reduction can also be applied to other structured DAE systems, see e.g. <ref type="bibr">[11,</ref><ref type="bibr">14,</ref><ref type="bibr">15,</ref><ref type="bibr">24]</ref>. We demonstrate our approach on a large-scale Riccati equation associated with the stabilization of Navier-Stokes flow, but the extension of the techniques described in this paper to other saddle point structured DAEs is straightforward.</p><p>This paper is organized as follows. The next section, Section 2, introduces the LQR problem, uses projection onto the constraint manifold to derive a projected Riccati equation, and reviews existence results for both the projected Riccati equation and the LQR problem. Section 3 reviews the main components of our algorithm in <ref type="bibr">[10]</ref> applied to the projected GCARE and Section 4 carefully exploits the special structure of the projected GCARE for an efficient numerical realization of the inexact low-rank Newton-ADI method. Finally, Section 5 illustrates the performance of our algorithm on a large-scale Riccati equation associated with the stabilization of Navier-Stokes flow around a cylinder -a problem also solved by B&#228;nsch et al. <ref type="bibr">[4]</ref>. As mentioned earlier, the algorithmic improvements in this paper lead to approximately 90-times speed-up over the algorithm used in <ref type="bibr">[4]</ref>.</p><p>Notation. Throughout the paper we consider the Hilbert space of matrices in R n&#215;n endowed with the inner product M, N = tr M T N = n i, j=1 M ij N ij and the corresponding (Frobenius) norm</p><p>1/2 . Furthermore, given real symmetric matrices M, N, we write M N if and only if M -N is positive semi-definite, and M N if and only if M -N is positive definite. The spectrum of a symmetric matrix M is denoted by &#963; (M).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">The LQR problem and the Riccati equation</head><p>In this section we present the mathematical statement of the LQR problem and the governing Hessenberg index-2 DAE, and we show how it can be transformed into a 'standard' LQR problem governed by an ODE using a projection onto the constraint manifold of the original DAE. Then we apply classical LQR theory to this transformed problem to compute, under standard conditions on the system, the solution of the LQR problem via the GCARE. As mentioned before, the problem transformation is performed to derive the solution, but the computations are done using the original DAE framework. The projection used to convert the DAE into an ODE was first used in a different context by Heinkenschloss et al. <ref type="bibr">[16]</ref>. For DAEs derived from a finite element discretization of the Stokes or linearized Navier-Stokes system, B&#228;nsch et al. <ref type="bibr">[4]</ref> show that this projection is a discrete version of the Leray projector. Projections have also been used by <ref type="bibr">Benner,</ref><ref type="bibr">Stykel [6]</ref> to formulate and solve GCAREs associated with index-2 DAEs, although, as already noted in the introduction, the projection there is different. Except for some extensions in problem statement and notation the material in this section is mostly known from <ref type="bibr">[4,</ref><ref type="bibr">16]</ref>, but is needed to provide the necessary background that allows us to switch between expressions using the original DAE system and the corresponding expressions using the transformed ODE system. Compared to <ref type="bibr">[4]</ref>, this section also provides a more detailed link between the representations of the optimal control of the LQR problem derived using the original DAE and transformed ODE system.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">The LQR problem</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Given matrices</head><p>where for given u &#8712; L 2 (0, &#8734;), the function y &#8712; L 2 (0, &#8734;) is obtained as the output of the Hessenberg index-2 Differential</p><p>Algebratic Equation system</p><p>)</p><p>(2.2c)</p><p>To ensure well-posedness of the LQR, we will make additional assumptions on the system (2.2) in Section 2.2. In the cost functional, we may replace the Euclidian norms by any weighted norm induced by positive definite matrices Q y and Q u . Here, we set both weighting matrices to the appropriate identity for ease of notation. It is straightforward to include non-identify weighting matrices into the problem description and the computational framework.</p><p>The LQR problem (2.1), (2.2) arises, e.g., in feedback stabilization of the Navier-Stokes equations, see B&#228;nsch et al. <ref type="bibr">[4]</ref> or Raymond <ref type="bibr">[23]</ref>. In this context, (2.2a), (2.2b) correspond to the linearized discretized Navier-Stokes equations, and v, p correspond to velocity and pressure, respectively. The problem also arises in feedback stabilization of multi-field flow problems, see B&#228;nsch et al. <ref type="bibr">[3]</ref>. In this case, (2.2a) includes additional equations such as linearized reaction equations, and v corresponds to velocities and the other fields, such as concentrations.</p><p>If we define </p><p>(2.4b)</p><p>The structure of (2.2) can be used to convert the LQR problem (2.1), (2.2) into a classical one governed by an ODE. We proceed as in <ref type="bibr">[4,</ref><ref type="bibr">16]</ref>. The constraint (2.2b) and the variable p can be eliminated from (2.2a), (2.2b) via the projection</p><p>(2.5)</p><p>The matrix &#928; obeys &#928; 2 = &#928; and &#928; M = M&#928; T , i.e., it is in fact an M-orthogonal projection. Furthermore, null(&#928; T ) = range(M -1 G) and range(&#928; T ) = null(G T ), <ref type="bibr">(2.6)</ref> which means that 0 = G T v(t) if and only if v(t) = &#928; T v(t).</p><p>We use the latter property to enforce (2.2b) and multiply (2.2a) by &#928; to arrive at</p><p>(2.7b) If needed, the function p can be computed from v, u using</p><p>(2.8) Equation (2.8) is obtained multiplying (2.2a) by G T M -1 and using (2.2b).</p><p>Since &#928; M&#928; T &#8712; R n v &#215;n v has an n p -dimensional null-space and cannot be inverted, (2.7) is still not an ODE. However, &#928; T v(t) &#8712; R n v is contained in the n vn p dimensional subspace range(&#928; T ) and we can explicitly express &#928; T v(t) as an element of this subspace. This is done using the decomposition</p><p>with l , r &#8712; R n v &#215;(n v -n p ) . In particular range( r ) = range(&#928; T ).</p><p>(2.10)</p><p>(2.11)</p><p>Using the decomposition (2.9), we define</p><p>and write the descriptor system (2.7) as</p><p>(2.13a)</p><p>(2.13b)</p><p>The DAE system (2.2) is equivalent to system (2.13), which is an ODE system since with M being symmetric and positive definite so is M, by x T Mx = ( r x) T M( r x) &gt; 0. Furthermore, the LQR problem (2.1), (2.2) is equivalent to the classical LQR problem (2.1), <ref type="bibr">(2.13)</ref>. We summarize this result in the following proposition. </p><p>where</p><p>is the unique stabilizing solution of the GCARE</p><p>(2.15)</p><p>See, e.g., Lancaster, Rodman <ref type="bibr">[19]</ref>.</p><p>The unique stabilizing solution of the GCARE is obtained by applying Newton's method to find a root of the quadratic operator</p><p>(2.16)</p><p>Given an approximate root X (k) , the new approximation is computed as the solution of</p><p>(2.17)</p><p>This method is known as the Kleinman-Newton method. See the original paper by Kleinman <ref type="bibr">[17]</ref> or the book by Lancaster, Rodman <ref type="bibr">[19]</ref>.</p><p>The system (2.17) is a Lyapunov equation and for large-scale problems the exact Kleinman-Newton method which is defined by (2.17) is impractical. This is particularly true for the Riccati equation (2.15) which is obtained from a large-scale DAE by projection. The projected matrices in (2.12) are not only large-scale, but because of the projections they are also dense. To overcome these difficulties, we need to 'undo' the projections in the numerical computations. We will discuss the details of our solution approach in the next section. In the remainder of this section we provide basic relationships between quantities for the projected problem and quantities for the original problem.</p><p>The Kleinman-Newton method applied to the projected Riccati equation (2.15) generates iterates 0</p><p>and corresponding feedback matrices</p><p>(2.18)</p><p>We want to write the corresponding feedback law -(</p><p>and</p><p>The convergence of the (exact) Kleinman-Newton method can now be expressed in the unprojected variables and in the context of the (2.2). First we show that the stability (detectability) of the system (2.13) is equivalent to the stability (detectability) of the system (2.2).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 2.</head><p>1. A matrix pencil (A, M) is called stable if it is regular and all the finite eigenvalues of (A, M) lie in the open left half-plane.</p><p>2. Let A, B, M be given by (2.12). The triple (A, B; M) is stabilizable if there exists a matrix K &#8712; R n v &#215;n u such that all finite eigenvalues of the matrix pencil</p><p>The following result is proven in <ref type="bibr">[27,</ref><ref type="bibr">Lemma 4.4</ref>].</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lemma 3. The matrix triple (A, B; M) is stabilizable ((C, A; M) is detectable) if and only if (A, B; M) is stabilizable ((C, A; M) is detectable).</head><p>With these preparations, the following result is an immediate consequence of the classical Kleinman-Newton convergence result <ref type="bibr">[17]</ref>, <ref type="bibr">[19]</ref>. See <ref type="bibr">[27,</ref><ref type="bibr">Thm. 4.5]</ref> for a detailed proof.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Theorem 4. Assume (A, B; M) is stabilizable and (C, A; M) is detectable. There exists a maximal symmetric solution X</head><p>and there is a constant &#954; such that X (k+1) -X ( * )</p><p>r is the solution of the Riccati equation specified in Theorem 4 and</p><p>In principle, the large-scale projected GCARE (2.15) can be solved using the Kleinman-Newton method <ref type="bibr">[17]</ref>. However, the size and special structure of (2.15) require the inexact solution of the Newton equation, a Lyapunov equation, in each step of the Kleinman-Newton method. Moreover, the explicit use of the large, dense projected matrices M, A, B, C (2.12) must be avoided in computations and the final algorithm must operate with the sparse matrices M, A, B, C (2.3) instead. To adopt our approach from <ref type="bibr">[10]</ref> to efficiently solve the large-scale projected GCARE (2.15), we first need to review the main components of our approach in <ref type="bibr">[10]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Inexact Kleinman-Newton for algebraic Riccati equations</head><p>Our approach in <ref type="bibr">[10]</ref> is based on an inexact Kleinman-Newton method with line search. Although the exact and, under additional conditions, inexact Kleinman-Newton method converges with step size fixed to one (see, e.g., Kleinman <ref type="bibr">[17]</ref> or Feitzinger et al. <ref type="bibr">[13]</ref>), variable step sizes can hugely improve the performance (Benner, Byers <ref type="bibr">[5]</ref>, Benner et al. <ref type="bibr">[10]</ref>). We will also observe this in our numerical tests, see Fig. <ref type="figure">2</ref> in Section 5. The line search method and analysis in <ref type="bibr">[5]</ref> are based on exact Lyapunov equation solves, which guarantees that some favorable properties of the Kleinman-Newton iterates are automatically preserved. Our paper <ref type="bibr">[10]</ref> extends line search algorithms and their analyses to inexact solves. An inexact Kleinman-Newton method without line search is analyzed in <ref type="bibr">[13]</ref>, but some assumptions made in <ref type="bibr">[13]</ref> do not hold when low-rank methods are applied to solve the Lyapunov equation iteratively. We extended the inexact Kleinman-Newton method and analysis to integrate the efficient low-rank ADI solver in <ref type="bibr">[10]</ref>. This section reviews the main algorithmic components of <ref type="bibr">[10]</ref> applied to the projected GCARE (2.15). The following Section 4 then carefully exploits the special structure of the projected GCARE (2.15) for an efficient numerical realization.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Inexact Kleinman-Newton method</head><p>At its core our method is an inexact Newton method applied to the GCARE R(X ) = 0, where R(X ) is the Riccati residual <ref type="bibr">(2.16)</ref>. Given an approximate solution X (k) &#8712; R (n v -n p )&#215;(n v -n p ) and a so-called forcing parameter &#951; k &#8712; (0, 1), we compute a</p><p>Then we compute a step size &#958; k &#8712; (0, 1] such that the sufficient decrease condition</p><p>is satisfied, where &#946; &gt; 0 is a given parameter. The new iterate is</p><p>(3.3)</p><p>We will discuss below how we compute an S (k) that satisfies (3.1). As we have shown in <ref type="bibr">[10]</ref>, if the forcing parameters in (3.1) are limited by</p><p>then the sufficient decrease condition (3.2) is satisfied for all step sizes &#958; k</p><p>To ensure convergence of the sequence of iterates {X (k) }, the step sizes &#958; k also need to be bounded away from zero. We will state the precise convergence result later, see Theorem 6 below. We use the Armijo rule to compute the step sizes &#958; k . This step size rule and others are discussed in <ref type="bibr">[10]</ref>, as well as conditions that ensure &#958; k &#8805; &#958; min &gt; 0 for all k. Instead of computing the new iterate S (k) as an approximate solution of R (X (k) )S (k) = -R(X (k) ), it is more favorable for our purposes to compute</p><p>as an approximate solution of R (X (k) ) (k) are Lyapunov equations, but the right hand side of the latter equation,</p><p>where K (k) is defined in (2.18), is low-rank and this will allow the application of the efficient low-rank ADI method (discussed in the next section) to compute X (k+1) . Note that</p><p>where</p><p>We define the projected Lyapunov residual at any X (k+1) by</p><p>(3.7)</p><p>The inexactness condition (3.1) means that we have to compute X (k+1) with</p><p>such that the corresponding projected Lyapunov residual satisfies</p><p>(3.9)</p><p>Using the definition (2.16), (3.5), and (3.8), the residual of the projected CARE at (3.3) can be written as</p><p>which can be evaluated efficiently for any &#958; k , and therefore can be used to efficiently compute a step size &#958; k &gt; 0 that satisfies (3.2). The inexact Kleinman-Newton method with line search is summarized in Algorithm 1 below.</p><p>Algorithm 1 Inexact Kleinman-Newton method with line search.</p><p>Input: A, M, B, C, tol Newton , initial stabilizing iterate X (0) , &#951; &#8712; (0, 1), and &#946; &#8712; (0, 1 -&#951;) Output: Approximate unique stabilizing solution X ( * ) of GCARE (2.15)</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>5:</head><p>Select &#951; k &#8712; (0, &#951;].</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>6:</head><p>Compute X (k+1) that solves the inexact Lyapunov equation</p><p>The following convergence theorem for the iterates generated by Algorithm 1 is adopted from <ref type="bibr">[10,</ref><ref type="bibr">Thm. 10]</ref> to match the notation of the projected Riccati equation (2.15). Theorem 6. Let (A, B; M) be stabilizable, let (C, A; M) be detectable and assume that for all k, there exists a symmetric positive semi-definite X (k+1) such that <ref type="bibr">(3.8</ref>) and (3.9) hold. Furthermore, let X (k) be the iterates generated by Algorithm <ref type="figure">1</ref> and<ref type="figure">A</ref> </p><p>(i) If the step sizes are bounded away from zero, i.e., &#958; k &#8805; &#958; min &gt; 0 for all k, then R(X (k) ) F &#8594; 0. (ii) If in addition the pencils (A (k) , M) are stable for k &#8805; k 0 , and X (k) 0 for all k &#8805; k 0 , then X (k) &#8594; X ( * ) , where X ( * ) 0 is the unique stabilizing solution of the GCARE (2.15).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Improved low-rank ADI method</head><p>The main expense in the inexact Kleinman-Newton Algorithm 1 is in Step 3.11b. We apply the real low-rank ADI method, which is detailed in <ref type="bibr">[10]</ref> and in <ref type="bibr">[27,</ref><ref type="bibr">Sec. 6.3.1]</ref>. This method generates a low-rank approximate solution X (k+1) of the Lyapunov equation in factored form. Compared to the original version of the ADI method <ref type="bibr">[7,</ref><ref type="bibr">20]</ref>, which is also the version used in B&#228;nsch et al. <ref type="bibr">[4]</ref>, we use two important modifications of the original ADI method. The first reorganizes the computation to obtain a low-rank representation of the Lyapunov residual in the ADI iterations <ref type="bibr">[8]</ref>, and the second exploits the fact that the ADI shifts must occur either as a real number or as a pair of complex conjugate numbers to write almost all<ref type="foot">foot_0</ref> matrices in the ADI iterations as real matrices <ref type="bibr">[8]</ref>. Most importantly, the improved method generates real matrices Z and W , each with few columns, such that ZZ T = X (k+1) satisfies (3.11a) and the corresponding Lyapunov residual L( X (k+1) ) = W W T obeys <ref type="bibr">(3.11)</ref>. We refer to <ref type="bibr">[10]</ref> or <ref type="bibr">[27,</ref><ref type="bibr">Sec. 6.3.1]</ref> for details on the derivation of the real low-rank ADI method. The detailed listing of this method is given in Algorithm 2 below.</p><p>Algorithm 2 Generalized real-valued low-rank residual ADI method.</p><p>W -1</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>4:</head><p>if Im (q ) = 0 then 5:</p><p>&#8730; -Re (q ), &#948; = Re (q ) / Im (q )</p><p>9:</p><p>10:</p><p>= + 1 15: end while Algorithms 1 and 2 work with the projected matrices, but need to be implemented operating on the matrices <ref type="figure">M,</ref><ref type="figure">A,</ref><ref type="figure">B,</ref><ref type="figure">C</ref>. This transformation will be described in the next section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Inexact Kleinman-Newton for algebraic Riccati equations associated with index-2 DAEs</head><p>The inexact Kleinman-Newton Algorithm 1 and the improved ADI Algorithm 2 are derived and stated in terms of the projected matrices in (2.12). As stated before, these matrices are dense, expensive to compute with and the explicit use of the projection needs to be avoided. As before, we use calligraphic font, like X (k) , to denote projected quantities, and roman font, like X (k) , to denote the corresponding quantities without projection.</p><p>Regarding the transformation of the iterates in the inexact Kleinman-Newton Algorithm 1, we already know from (2.19) that</p><p>To undo the projections, we multiply the Lyapunov equations and the Riccati residuals from the left by l and from the right by T l and replace Steps 6 and 7 in Algorithm 1 by the following.</p><p>6: Compute X (k+1) that solves the inexact Lyapunov equation The reason for multiplying by l and T l is that the projection &#928; emerges. In fact, using (2.12), (2.9), and (4.1), the left hand side in (4.2a) becomes</p><p>where</p><p>Although the projection &#928; emerges in (4.3), it will not be computed and used explicitly. We outline the main ideas in the following subsections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Low-rank residual ADI for index-2 DAE systems</head><p>Recall (2.9) and (2.12). We have</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>.4)</head><p>To transform the matrices in the improved ADI Algorithm 2 we set</p><p>Using (2.12) and (4.1), the linear system in Step 3 of Algorithm 2 is transformed into</p><p>We define</p><p>(4.7a)</p><p>From (2.9) it follows that</p><p>(4.7b)</p><p>Finally, multiplying (4.6) by l from the left, using (2.9), (4.7a) and (4.7b), the linear system in Step 3 of Algorithm 2 is written as</p><p>As it is shown by Heinkenschloss et al. <ref type="bibr">[16]</ref> and B&#228;nsch et al. <ref type="bibr">[4]</ref> the solution of the projected system (4.8) is equivalent to the solution of the 2 &#215; 2 block system</p><p>where " * " indicates that the second block of the solution matrix is not needed. Finally, since K (k) B T is dense, the matrix in (4.9) is written as a low-rank perturbation</p><p>and the solution of (4.9) is computed using the Sherman-Morrison-Woodbury formula. See B&#228;nsch et al. <ref type="bibr">[4]</ref> or Weichelt <ref type="bibr">[27, p. 67</ref>]. We use (2.9), (4.5) to write the projected Lyapunov residual</p><p>(4.10)</p><p>Rather than computing W and then multiplying by &#928; , we can update W = &#928; W directly. In fact, multiplying line 5 in Algorithm 2 with l from the left and using (4.5), (4.7a) yields</p><p>where in the last step we have used the M-orthogonality of &#928; , i.e., &#928; M = M&#928; T and (4.7b). Thus, the projected low-rank residual factor can be accumulated via</p><p>without using any explicit projections. Only the initial right hand side W (k) needs to be projected to define</p><p>(4.12)</p><p>This one projection at the beginning of the ADI method is computed by first solving</p><p>(again, " * " indicates that the second block of the solution is not used) and then setting</p><p>See Heinkenschloss et al. <ref type="bibr">[16]</ref> or Weichelt <ref type="bibr">[27,</ref><ref type="bibr">Lemma 4.1]</ref>. This projection is less expensive than a single ADI step and does not considerably increase the overall computation costs. Moreover, the right-hand side W -1 in (4.8), (4.9) can be replaced by W -1 , since</p><p>To incorporate this improved ADI method into Algorithm 1, some remaining issues, such as the storage of the Newton step and the projected Riccati residual, need to be addressed. This is done in the next subsection, expanding the statements in [10, Sec. 5.2].</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Low-rank Riccati residual for index-2 DAE systems</head><p>The Newton step S (k) = r S (k) T r is only used in the computation of the step size &#958; k , since the inexact Kleinman-Newton step (3.8) directly iterates over the preliminary solution <ref type="bibr">(3.5)</ref>, and the definition of the feedback matrix in <ref type="bibr">(2.14)</ref>, this product can be written as</p><p>which characterizes the feedback change corresponding to the preliminary or definite new iterate X (k+1) or X (k+1) . Using (2.12), (4.1), and</p><p>which characterizes the feedback change corresponding to the preliminary new iterate or X (k+1) or new iterate X (k+1) . Hence, the dense Newton step S (k) is never formed explicitly.</p><p>The definition X (k+1) = ZZ T and update in Step 13 of Algorithm 2 implies the formula    (k+1) .</p><p>The Riccati residual can be written in low-rank form as</p><p>This representation can be used to efficiently compute l R(X (k) ) T l F .</p><p>In the initial iteration k = 0 with X (0) = 0, (4.18) holds with W (0) = C T and K (k) = 0. Equation (3.10) and L( X</p><p>which is of the form (4.18) with</p><p>Using (4.5), <ref type="bibr">(4.19)</ref> the projected Riccati residual R(X (k+1) ) := l R(X (k+1) ) T l can be written as k+1) . In the second to last equation in (4.20) we have used the identity</p><p>which follows from the M-orthogonality of &#928; and &#928; T (X (k+1) -</p><p>Algorithm 3 Inexact low-rank Kleinman-Newton-ADI for index-2 DAE systems.</p><p>Input: M, A, G, B, C , initial feedback K (0) , tol Newton , &#951; &#8712; (0, 1), and &#946; &#8712; (0, 1 -&#951;)</p><p>Compute ADI shifts {q i } nADI i=1 = {q i } nADI i=1 &#8834; C -ordered such that complex pairs form consecutive entries and choose &#951; k &#8712; (0, &#951;].</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>5:</head><p>Set</p><p>if Im (q ) = 0 then 10:</p><p>13: else 14:</p><p>15:</p><p>16:</p><p>= + 1 18:</p><p>end while 23:</p><p>Compute &#958; k &#8712; (0, 1) using, e.g., the Armijo rule.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>25:</head><p>else 26:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>27:</head><p>end if 28:</p><p>31:</p><p>where K (0) is an initial stabilizing feedback. Equation <ref type="bibr">(4.20)</ref> shows that the Riccati residual R(X (k+1) ) can be computed without any additional explicit projection. The representation <ref type="bibr">(3.10)</ref> shows that R(X (k) + &#958; k S (k) ) is a quartic polynomial with scalar coefficients. Just as in <ref type="bibr">[10,</ref><ref type="bibr">Sec. 5]</ref> this is used for an efficient implementation of the line search computation.</p><p>The final feedback at the end of the k + 1-st Newton step is defined via</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>23)</head><p>Only the feedback matrix is needed, but if desired the Riccati iterate can be computed in low-rank form as follows. Assuming the previous Riccati iterate is defined via X (k) = Z (k) Z (k) T and the preliminary solution is defined via X (k+1) = Z (k+1) Z (k+1) T , the new Riccati iterate can be written as</p><p>whose size depends on the number of ADI steps in the k-th and k+1-st Newton iteration. The entire process of the inexact low-rank KN-ADI method is depicted in Algorithm 3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Numerical experiments</head><p>We illustrate the benefits of Algorithm 3 to solve the GCARE associated with the solution of LQR problem (2.1), (2.2) governed by the linearized Navier-Stokes equation. Since our problem set-up is identical to that in the paper by B&#228;nsch et  al. <ref type="bibr">[4]</ref> and in Weichelt's PhD Thesis <ref type="bibr">[27]</ref>, we only sketch it here and refer to <ref type="bibr">[4,</ref><ref type="bibr">27]</ref> for details. Additional numerical results can be found in <ref type="bibr">[27]</ref>. The domain on which the Navier-Stokes and linearized Navier-Stokes equations are posed is shown in Fig. <ref type="figure">1</ref>. Inflow boundary conditions are posed on the left boundary, no-slip conditions are posed on part of the cylinder boundary and on the top and bottom boundary, and outflow conditions are imposed on the right boundary. Controls are applied on two segments on the cylinder wall (indicated by feed,1 , feed,2 ). Specifically, for each segment a spatial profile is specified, so that the number of inputs in (2.2) is n u = 2. As described in detail in <ref type="bibr">[4,</ref><ref type="bibr">Sec. 2.7]</ref>, <ref type="bibr">[27,</ref><ref type="bibr">Sec. 4.1.3]</ref>, an operator is constructed that converts these Dirichlet boundary controls to distributed controls, such that</p><p>2). The observations are chosen to be the vertical velocities of the linearized Navier-Stokes equations at the seven points indicated by P obs,i . Thus, y(t) &#8712; R 7 , n y = 7. Moreover, we penalize the output by &#945; &gt; 0, i.e., the output equation (2.2c) takes the concrete form y(t) = &#945;Cv(t) with C &#8712; R 7&#215;n v specified in [4, p. A855], <ref type="bibr">[27,</ref><ref type="bibr">Sec. 4.4.1]</ref>.</p><p>The solution to the steady state Navier-Stokes equation around which is linearized, as well as the linearized Navier-Stokes equations, i.e., the matrices in (2.1), (2.2) are computed using the finite element flow solver NAVIER <ref type="bibr">[1]</ref>, which uses P 2 -P 1</p><p>Taylor-Hood elements and is written in FORTRAN90. The matrices in (2.1), (2.2) are generated using NAVIER and then stored using the so-called matrix market format <ref type="bibr">[12]</ref>. The computations for the resulting matrix equations are performed with MATLAB R2012b on a 64-bit CentOS 5.5 server with Intel Xeon X5650 at 2.67 GHz, with 2 CPUs, 12 cores (6 cores per CPU), and 48 GB main memory available.</p><p>We conduct experiments with Reynolds number Re = 100, 200, 300, 400, 500, and we use six finite element discretization levels, with Level 1 being the coarsest (shown in Fig. <ref type="figure">1</ref>). The matrix sizes corresponding to these discretizations are listed in Table <ref type="table">1</ref>.</p><p>For larger Reynolds number, the matrix pencil (A, M) is not stable (see [4, Fig. <ref type="figure">2</ref>], <ref type="bibr">[27,</ref><ref type="bibr">Sec. 4.2.3]</ref>) and a nonzero initial feedback is needed. We construct the initial feedback K (0) as specified in [4, Sec. 2.7], <ref type="bibr">[27,</ref><ref type="bibr">Sec. 4.2.3]</ref>. First, we illustrate the impact of the line search. Fig. <ref type="figure">2</ref> shows the convergence of the 'exact' Kleinman-Newton method (i.e., the Lyapunov equation is solved with fixed high residual tolerance) and the inexact Kleinman-Newton method (Algorithm 1 with &#951; k = min{0.1, 0.9 &#8226; R(X (k) ) F }) both with and without line search for the LQR problem governed by the discretized linearized Navier-Stokes equations with Re = 500, output weight &#945; = 10 4 , and discretization level 1. There is little difference in the Riccati residuals between the exact and the inexact Kleinman-Newton method. However, there is a big difference between the method with and without line search. Without line search the relative residual grows dramatically in the initial (k = 0) iteration. With line search, the line search is active &#958; k &lt; 1 for iterations k = 0, 1, 2 (exact Kleinman-Newton) and iterations k = 0, 1 (inexact Kleinman-Newton). Fig. <ref type="figure">2</ref> also shows the Riccati residuals corresponding to &#958; k = 1 for the iterations where the line search is active. That the line search is typically only active in the first few iterations has also been observed in other applications of Riccati equations (see, e.g., <ref type="bibr">[5]</ref>).</p><p>Next, we illustrate the influence of the various improvements to the overall performance of the Algorithm 3. Specifically we compare five set-ups, where 'Setup i' corresponds to a basic version of the Kleinman-Newton-ADI method, and 'Setup v'  Finally, adding the line-search in Setup v improves the method further. The number of ADI iterations and linear system solves is reduced by a factor of two. The reduction in ADI iterations also reduced the time for the shift computation. The line search is less than one only in the first iteration and the cost of step size computation is negligible. Comparing the total computation times shows that the algorithm specified in Setup v is 90-times faster than the algorithm specified in Setup i.</p><p>As we will see next, the solution of the Riccati equation becomes more difficult as the output weighting &#945; increases. In those cases the speedup of Setup v over Setup i is even more important.</p><p>The following numerical tests focus on Algorithm 3, with Setup v. As mentioned before, Algorithm 3 with Setup v will be referred to as 'iKNqLS'. Table <ref type="table">3</ref> documents the performance of iKNqLS applied to our test problem as Reynolds number Re, output weight &#945;, and discretization level change. Table <ref type="table">3a</ref> shows that the number of Kleinman-Newton iterations increases moderately with an increasing &#945; and increasing Reynolds number. Similarly, the number of total ADI iterations needed to approximately solve the Lyapunov equations increases with an increasing &#945; and increasing Reynolds number. Furthermore, line search is only necessary for higher Reynolds numbers and higher output weights.</p><p>Table <ref type="table">3b</ref> shows that for Re &#8804; 300, the number of Newton iterations remains nearly constant as the refinement level is increased. For Re &#8804; 300 and refinement level greater than two the number of iterations where the step size is less than one is unusually large. We believe that this effect is a result of the instability of the matrix pencil. Solving the first Newton step inexactly might yield an intermediate solution that is slightly (especially in finite precision arithmetic) not stabilizing. Therefore, the following ADI iteration tends to diverge. Our algorithm detects this behavior by monitoring the Riccati and Lyapunov residual continuously. Although this behavior is not covered by the convergence proof in Theorem 6, where a stabilizing solution for k &#8805; k 0 is required, our algorithm handles this situation by deleting the last ADI step and performing a line search. This yields convergence in all examples we considered. The relative Riccati residual seems to stagnate for a couple of iterations and, hence, an increasing amount of line search runs is required.</p><p>Convergence theory for the exact Kleinman-Newton method guarantees that the matrix pencils are stable if the initial matrix pencil is stable. Thus, another approach to circumvent the appearance of a possibly unstable pencil arising from inexact Lyapunov equation solution is to use a smaller fixed ADI tolerance for the first Newton step. Rather than using the Lyapunov residual tolerance tol ADI = &#951; k R(X (k+1) ) , we set tol ADI = 10 -2 for the first two Newton iterations. If the relative Riccati residual decreases and drops below 5 &#8226;10 -1 , the method switches to the iKNqLS scheme (i.e., tol ADI = &#951; k R(X (k+1) ) ). This is referred to as 'exact' start. Using the tol ADI = &#951; k R(X (k+1) ) in all iterations is referred to as inexact start. Table <ref type="table">4</ref> compares both starting procedures for Re &#8805; 300 and refinements Level 3-6. The 'exact' start prevents the stagnation of the relative Riccati residual and reduces the number of Newton iterations. The line search is used in at most one iteration. However, the 'exact' solves in the first Newton iterations increase the number of ADI iterations. Therefore, in most cases a decrease in the number of Newton iterations does not translate into a significant decrease in the total number of ADI iterations (and therefore significant decrease in overall computing time) when the 'exact' start is used.</p><p>Overall iKNqLS is able to solve the Riccati equation in all cases. Although there is no theoretical justification, our numerics indicate that the inclusion of line search and computationally inexpensive monitoring of the low-rank Riccati and Lyapunov residuals enables the algorithm to successfully cope with intermediate iterates that are nearly not stabilizing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusions</head><p>We have extended our inexact Kleinman-Newton method low-rank ADI solver and line search from <ref type="bibr">[10]</ref> to Riccati equations governed by Hessenberg index-2 DAEs. Using the projection idea from Heinkenschloss et al. <ref type="bibr">[16]</ref> and B&#228;nsch, Benner <ref type="bibr">[2]</ref> we transform the problem governed by the DAE into a 'classical' problem governed by an ODE. Our algorithm in <ref type="bibr">[10]</ref> is then applied to this transformed problem. However, the projected ODE is never computed in practice. Instead, a careful exploitation of the problem structure allows the formulation of the algorithm in the original DAE context. We have demonstrated the performance of our Riccati solver to a problem arising in feedback stabilization of Navier-Stokes flow around a cylinder. The numerical results document the impact of various algorithmic components on the overall performance. The algorithmic improvements in this paper lead to approximately 90-times speed-up over a previously used Kleinman-Newton-ADI method. Moreover, we have explored the performance of the new algorithm for various Reynolds numbers, mesh refinement levels and output weights. The new algorithm was able to solve all instances. Moreover, although there is no theoretical justification, our numerics indicate that the inclusion of line search and computationally inexpensive monitoring of the low-rank Riccati and Lyapunov residuals enables the new algorithm to successfully cope with intermediate iterates that are (slightly) not stabilizing.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0"><p>The linear system solve still has a complex coefficient matrix and thus the intermediate V is complex. This can be avoided along the lines of<ref type="bibr">[18,</ref>   Remark  </p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1"><p><ref type="bibr">4</ref>.4], but is not done in our implementation.</p></note>
		</body>
		</text>
</TEI>
