<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Randomized Greedy Algorithms for Sensor Selection in Large-Scale Satellite Constellations</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>05/31/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10440126</idno>
					<idno type="doi">10.23919/ACC55779.2023.10156009</idno>
					<title level='j'>American Control Conference, {ACC}</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Michael Hibbard</author><author>Abolfazl Hashemi</author><author>Takashi Tanaka</author><author>Ufuk Topcu</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[We study the problem of estimating a random process from the observations collected by a network of sensors that operate under resource constraints. When the dynamics of the process and sensor observations are described by a state-space model and the resource are unlimited, the conventional Kalman filter provides the minimum mean-square error (MMSE) estimates. However, at any given time, restrictions on the available communications bandwidth and computational capabilities and/or power impose a limitation on the number of network nodes whose observations can be used to compute the estimates. We formulate the problem of selecting the most informative subset of the sensors as a combinatorial problem of maximizing a monotone set function under a uniform matroid constraint. For the MMSE estimation criterion we show that the maximum element-wise curvature of the objective function satisfies a certain upper-bound constraint and is, therefore, weak submodular. We develop an efficient randomized greedy algorithm for sensor selection and establish guarantees on the estimator's performance in this setting. Extensive simulation results demonstrate the efficacy of the randomized greedy algorithm compared to state-of-the-art greedy and semidefinite programming relaxation methods.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>placement in power networks <ref type="bibr">[9]</ref>- <ref type="bibr">[11]</ref>, sensor scheduling in wireless sensor networks <ref type="bibr">[2]</ref>, <ref type="bibr">[12]</ref>, and subset selection in machine learning <ref type="bibr">[13]</ref>.</p><p>For a variety of performance criteria, finding an optimal subset of sensors requires solving a computationally challenging combinatorial optimization problem, possibly using branch-and-bound search <ref type="bibr">[14]</ref>. By reducing it to the set cover problem, sensor selection was in fact shown to be NP-hard <ref type="bibr">[15]</ref>. This hardness result has motivated development of numerous heuristics and approximate algorithms. For instance, <ref type="bibr">[16]</ref> formulated the sensor selection problem as the maximization (minimization) of the log det of the Fisher information matrix (error covariance matrix), and found a solution by relaxing the problem to a semidefinite program (SDP). The computational complexity of finding the solution to the SDP relaxation of the sensor selection problem is cubic in the total number of available sensors, which limits its practical feasibility in large-scale networks consisting of many sensing nodes. Moreover, the solution to the SDP relaxation comes with no performance guarantees. To overcome these drawbacks, Shamaiah et al. <ref type="bibr">[3]</ref> proposed a greedy algorithm guaranteed to achieve at least (1 1/e) of the optimal objective at a complexity lower than that of the SDP relaxation. The theoretical underpinnings of the greedy approach to the sensor selection problem in <ref type="bibr">[3]</ref> are drawn from the area of submodular function optimization. In particular, these results stem from the fact that the logarithm of the determinant (log det) of the Fisher information matrix is a monotone submodular function. Nemhauser et al. <ref type="bibr">[17]</ref> studied maximization of such a function subject to a uniform matroid constraint and showed that the greedy algorithm, which iteratively selects items providing maximum marginal gain, achieves (1 1/e) approximation factor. More recently, <ref type="bibr">[4]</ref>- <ref type="bibr">[6]</ref>, <ref type="bibr">[8]</ref>, employed and analyzed greedy algorithms for finding approximate solutions to the log det maximization problem in a number of practical settings.</p><p>Most of the existing work on greedy sensor selection has focused on optimizing the log det of the Fisher information matrix, an objective indicative of the volume of the confidence ellipsoid. However, this criterion does not explicitly relate to the mean-square error (MSE) which is often a natural performance metric of interest in estimation problems. The MSE, i.e., the trace of the covariance matrix of the estimation error, is not supermodular <ref type="bibr">[18]</ref>- <ref type="bibr">[23]</ref>. Therefore, its negative value, which we would like to maximize, is not submodular. Consequently, the setting and results of <ref type="bibr">[17]</ref> do not apply to the MSE minimization problem. Recently, Wang et al. <ref type="bibr">[24]</ref> analyzed performance of the greedy algorithm in the general setting of maximizing a monotone non-decreasing objective function that is not necessarily submodular. They used a notion of the total curvature &#181; of the objective function to show that the greedy algorithm provides a ((1 + &#181;) 1 )-approximation under a matroid constraint. However, determining the elemental curvature defined in <ref type="bibr">[24]</ref> is itself an NP-hard problem. Therefore, providing performance guarantees for the settings where the objective function is not submodular or supermodular, such as the trace of the covariance matrix of the estimation error in the sensor selection problem, remains a challenge.</p><p>On another note, processing massive amounts of data collected by modern large-scale networks may be challenging even for greedy algorithms. To further reduce the computational burden of maximizing a monotone increasing, submodular function subject to cardinality constraints, the authors of <ref type="bibr">[13]</ref> proposed a stochastic greedy algorithm that achieves (1 1/e &#9999;)-approximation factor, where &#9999; denotes a parameter that can be varied to explore the performance-complexity trade-off. However, the results of <ref type="bibr">[13]</ref> do not apply to the sensor selection problem under The objective is to select a small subset of range and angular measurements gathered by the UAVs to communicate to the control unit.</p><p>the (non-submodular) MSE objective.</p><p>In this paper, we formulate the task of selecting sensors in a large-scale network as the problem of maximizing a monotone non-submodular objective function that is directly related to the mean-square estimation error. By analyzing curvature of the objective function, we derive sufficient conditions under which the function is weak submodular. In the important setting where a state-space model describes the dynamics of the process and sensor observations, we propose a randomized greedy algorithm and find a bound on the MSE of the state estimate formed by the Kalman filter that uses the measurements of the sensors selected by the proposed algorithm. A further implication of these results is that, when the measurement vectors are Gaussian or Bernoulli as frequently encountered in reduced-dimensionality Kalman filtering using random projections <ref type="bibr">[25]</ref>, the MSE objective is weak submodular with high probability. Our extensive simulations demonstrate that the proposed randomized greedy sensor selection scheme significantly outperforms both greedy and SDP relaxation methods in terms of computational complexity, and hence runtime, while providing essentially the same or improved MSE performance.</p><p>The rest of the paper is organized as follows. Section II presents a motivating example and sets up the system model. In Section III we describe the novel formulation of the sensor selection problem and derive a bound on the curvature of the MSE-related objective function. In Section IV, we introduce the randomized greedy algorithm and analyze its performance. Section V presents the simulation results while Section VI states the concluding remarks.</p><p>Notation: Bold capital letters denote matrices while bold lowercase letters represent vectors. H k (i, j) is the (i, j) entry of the time-varying matrix H k at time k, h k,j is the j th row of H k , H k,S is a submatrix of H k that consists of the rows of H k indexed by the set S, and max (H k ) and min (H k ) are the maximum and minimum eigenvalues of H k , respectively. Spectral (`2) norm of a matrix is denoted by k.k. I n 2 R n&#8677;n is the identity matrix. Moreover, let [n] := {1, 2, . . . , n}.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. SYSTEM MODEL AND PROBLEM FORMULATION</head><p>This section starts by a description of a motivating example of multi-object tracking under communication and power constraints. Then we proceed to define the system model and mathematically formulate the sensor selection problem studied in the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Motivating example: Accelerated multi-object tracking</head><p>Consider a tracking system, shown in Fig. <ref type="figure">1</ref>, where a control unit surveys an area via a swarm of unmanned aerial vehicles (UAVs). The UAVs are equipped with GPS and radar systems and can communicate with each other over locally established communication channels. However, only a few of the UAVs known as swarm leaders are allowed to communicate to the control unit because of various practical restrictions such as power constraints. The UAVs patrol the area according to a predefined search pattern to gather information about the location of mobile objects of interest. Each UAV, by using its radar system, acquires range and angular measurements of all the objects that are within the maximum radar detection range and are capable of transmitting those measurements to the swarm leaders. Due to limitations on the rate of communication between the swarm leaders and the control unit, and to reduce delays in tracking from high computation, only a subset of the gathered measurements is communicated to the control unit. In order to track the locations of the object, the control unit employs Kalman filtering using the received measurements. Therefore, the goal of swarm leaders is to perform sensor scheduling and select a subset of range and angular measurements such that (i) the communication constraint is satisfied, and (ii) the mean-square error of the Kalman filter estimate of the objects' locations is minimized.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. System model</head><p>Consider a discrete-time, linear, time-varying state-space model described by</p><p>where x k 2 R m is the state vector at time k that we aim to estimate, y k 2 R n is the measurement vector, w k 2 R m and v k 2 R n are zero-mean white Gaussian noise processes with covariances Q k and R k , respectively, A k 2 R m&#8677;m is the state transition matrix and H k 2 R n&#8677;m is the matrix whose rows at time k are the measurement vectors h k,i 2 R m , 1 &#63743; i &#63743; n. We assume the states x k are uncorrelated with w k and v k . Additionally, we assume</p><p>I m , and R k = diag( 2 1 , . . . , 2 n ). Note that, unlike the past work on greedy sensor selection in <ref type="bibr">[3]</ref>, <ref type="bibr">[12]</ref>, <ref type="bibr">[26]</ref>, <ref type="bibr">[27]</ref>, this model does not restrict the noise covariance matrix to be a multiple of identity.</p><p>Due to limited resources, fusion center aims to select K out of n sensors and use their measurements to estimate the state vector x k such that the trace of the covariance matrix of the estimation error, i.e., the MSE of the estimator implemented using the Kalman filter is minimized. Similar to prior work in <ref type="bibr">[3]</ref>, <ref type="bibr">[12]</ref>, <ref type="bibr">[16]</ref>, we assume that the measurement vectors h k,i are available at the fusion center. Let xk|k 1 and xk|k denote the predicted and filtered linear minimum mean-square error (LMMSE) estimators of x k , respectively. In other words, xk|k 1 is the LMMSE estimator of x k given {y S1 , . . . , y S k 1 } and xk|k is the LMMSE estimator of x k given {y S1 , . . . , y S k }, where S j denotes the set of sensors selected at time j and y Sj denotes the vector of measurements collected by those sensors. Moreover, let P k|k 1 and P k|k denote the predicted and filtered error covariance matrix of the Kalman filter at time instant k, respectively, i.e.,</p><p>where</p><p>) and the measurements are uncorrelated across sensors, it holds that</p><p>Furthermore,</p><p>is the corresponding Fisher information matrix. In the information form, the filtered estimator of x k is expressed as</p><p>The MSE of the estimate found in ( <ref type="formula">2</ref>) is given by the trace of the filtered error covariance matrix P k|k :</p><p>To minimize (3), at each time step the fusion center seeks a solution to the optimization problem min</p><p>By a reduction to the well-known set cover problem, the combinatorial optimization (4) can be shown to be NP-hard <ref type="bibr">[15]</ref>, <ref type="bibr">[28]</ref>. In principle, to find the optimal solution one needs to exhaustively search over all schedules of K sensors.</p><p>The techniques proposed in <ref type="bibr">[16]</ref>, albeit for an optimality criterion different from MSE and a simpler measurement model, find a subset of sensors that yields a sub-optimal MSE performance while being computationally much more efficient than the exhaustive search. In particular, <ref type="bibr">[16]</ref> relies on finding the solution to the following SDP relaxation: min</p><p>The complexity of the SDP algorithm scales as O(n 3 ) which is infeasible in many practical settings. Furthermore, there are no guarantees on the achievable MSE performance of the SDP relaxation. Note that when the number of sensors in a network and the size of the state vector x k are relatively large, even the greedy algorithm proposed in <ref type="bibr">[3]</ref> may be computationally prohibitive.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. SENSOR SELECTION VIA OPTIMIZING A WEAK SUBMODULAR OBJECTIVE</head><p>Leveraging the idea of weak submodularity, in this section we propose a new formulation of the sensor selection problem concerned with minimizing the MSE of the Kalman filter that relies on a subset of network nodes to track states of a hidden random process. We first overview concepts that are essential for the development of the proposed framework.</p><p>for all subsets S &#10003; T &#8674; X and j 2 X\T .</p><p>A concept closely related to submodularity is the notion of curvature of a set function. The curvature quantifies how close the function is to being submodular. In particular, here we define the element-wise curvature.</p><p>Definition 3. The element-wise curvature C l of a monotone non-decreasing function f is defined as</p><p>where</p><p>denote the marginal values of adding element i to sets S and T , respectively, and</p><p>A set function is submodular if and only if C max &#63743; 1. We refer to f (S) as being weak submodular if its curvature</p><p>Let X be a finite set and let I denote a collection of subsets of X. The pair M = (X, I) is a matroid if the following two statements hold:</p><p>&#8226; Hereditary property. If T 2 I, then S 2 I for all S &#10003; T .</p><p>&#8226; Augmentation property. If S, T 2 I and |S| &lt; |T |, then there exists e 2 T \S such that S [ {e} 2 I.</p><p>The collection I is called the set of independent sets of the matroid M. A maximal independent set is a basis. It is easy to show that all the bases of a matroid have the same cardinality.</p><p>Given a monotone non-decreasing set function f : 2 X ! R with f (;) = 0, and a uniform matroid M = (X, I), we are interested in solving the combinatorial problem</p><p>Recall that for Kalman filtering in the resource constrained scenario, if S k is the set of sensors selected at time k then the error covariance matrix of the filtered estimate is P k|k = F 1 S k , the inverse of the corresponding Fisher information matrix. Let us define f (S) as</p><p>Clearly, since P k|k 1 is known, there is a one-to-one correspondence between f (S k ) computed for a given subset of sensors S k and the MSE of the LMMSE estimator (i.e., filtered estimate of the Kalman filter) that uses measurements acquired by the sensors in S k . Therefore, we can express the optimization problem (4) as</p><p>We now argue that ( <ref type="formula">9</ref>) is indeed an instance of the general combinatorial problem <ref type="bibr">(8)</ref>. By defining X = [n] and</p><p>In Proposition 1 below we characterize important properties of f (S) and develop a recursive scheme to efficiently compute the marginal gain of querying a sensor. The formula for the marginal gain of f (S) is also of interest in our subsequent analysis of its weak submodularity properties.</p><p>Then, f (S) is a monotonically increasing set function, f (;) = 0, and</p><p>where upon adding element j to S, F S is updated according to</p><p>Proof. See Appendix A. &#8965;</p><p>As stated in Section I, the MSE is not supermodular <ref type="bibr">[18]</ref>, <ref type="bibr">[23]</ref>. Consequently, the proposed objective f (S) =</p><p>is also not submodular. However, as we show in Theorem 1, under certain conditions f (S) is characterized by a bounded maximum element-wise curvature C max . Theorem 1 also states a probabilistic theoretical upper bound on C max in scenarios where at each time step the measurement vectors h k,j 's are realizations of independent identically distributed (i.i.d.) random vectors drawn from a suitable distribution.</p><p>Before proceeding to Theorem 1 and its proof, we first state the matrix Bernstein inequality <ref type="bibr">[29]</ref> and Weyl's inequality <ref type="bibr">[30]</ref> which we will later use in the proof of Theorem 1.</p><p>Lemma 1. (Matrix Bernstein inequality <ref type="bibr">[29]</ref>) Let {X `}n `=1 be a finite collection of independent, random, Hermitian matrices in R m&#8677;m . Assume that for all `2 [n],</p><p>Then, for all q &gt; 0, it holds that</p><p>Lemma 2. (Weyl's inequality <ref type="bibr">[30]</ref>) Let A and B be two m &#8677; m real positive definite matrices. Then it holds that</p><p>where l (A) denotes the l th largest eigenvalue of A.</p><p>We now proceed to the statement and proof of Theorem 1.</p><p>Theorem 1. Let C max be the maximum element-wise curvature of f (S), the objective function of the sensor selection problem. Assume that kh k,j k 2 2 &#63743; C for all j and k. If</p><p>for some 0 &lt; &lt; min (P k|k 1 ), then it holds that</p><p>Furthermore, if h k,j 's are i.i.d. zero-mean random vectors with covariance matrix 2 h I m such that 2 h &lt; C, then for all q &gt; 0, with probability</p><p>it holds that</p><p>Proof. We prove the statement of the theorem by relying on the recursive expression for the marginal gain stated in Proposition 1. We first establish a sufficient condition for weak submodularity of f (S). In particular, from the definition of the element-wise curvature and <ref type="bibr">(10)</ref>, for all (S, T, j) 2 X l we obtain</p><p>where the inequality follows from the Courant-Fischer min-max theorem <ref type="bibr">[30]</ref>. Notice that max (F 1 S ) = min (F S ) 1 and min (F T ) min (F S ) min (F ; ) = min (P 1 k|k 1 ) by Lemma 2. This fact, along with the definition of C max implies</p><p>where (a) follows from the fact that max (F S )</p><p>) and (b) holds since</p><p>is a monotonically increasing function for x &gt; 0. Now, since the maximum eigenvalue of a positive definite matrix satisfies the triangle inequality, we have</p><p>Therefore, by combining inequalities <ref type="bibr">(15)</ref> and ( <ref type="formula">20</ref>) we obtain the result in <ref type="bibr">(16)</ref>.</p><p>Next, to analyze the setting of i.i.d random measurement vectors, we bound max (F [n] ) using Lemma 1. Let</p><p>h I m and Y = P n j=1 X j . To use the result of Lemma 1, one should first verify expressions in <ref type="bibr">(12)</ref>. To this end, note that</p><p>This in turn implies that</p><p>by the linearity of expectation and the triangle inequality. To proceed, we need to determine max (X j ) and E[X 2 j ]. First, let us verify h k,j is an eigenvector of X j by observing that</p><p>where h k,j h &gt; k,j 2 h I m is the corresponding eigenvalue. Since h k,j h &gt; k,j is a rank-1 matrix, other eigenvalues of X j are all equal to</p><p>and we recall that C 2 h &gt; 0. We can now establish an upper bound on</p><p>where we have used the fact that E[X j ] = 0. Thus,</p><p>h . Now, according to Lemma 1, for all q &gt; 0 it holds that Pr{ max (Y) &#63743; q} p where</p><p>Therefore,</p><p>with probability p. This completes the proof. &#8965; Remark 1: The setting of i.i.d. random vectors described in Theorem 1 arises in scenarios where sketching techniques, such as random projections, are used to reduce dimensionality of the measurement equation (see <ref type="bibr">[25]</ref> for more details). The following are often encountered examples of such settings:</p><p>1) Multivariate Gaussian measurement vectors: Let h k,j &#8672; N (0, 1 m I m ) for all j. It is straightforward to show that E[kh k,j k 2  2 ] = 1. Furthermore, it can be shown that kh k,j k 2 2 is with high probability concentrated around its expected value. Therefore, for this case, </p><p>and let</p><p>be the condition number of P k|k 1 . Then, following some elementary numerical approximations, we obtain the following corollary.</p><p>for some c 1 &gt; 1. Then with probability</p><p>it holds that C max &#63743; 3 for some c 2 &gt; 0.</p><p>Informally, Theorem 1 states that for a well-conditioned P k|k 1 the curvature of f (S) is small, which implies weak submodularity of f (S). Furthermore, the probability of such an event exponentially increases with the number of available measurements.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. RANDOMIZED GREEDY SENSOR SELECTION</head><p>The complexity of SDP relaxation and greedy algorithms for sensor selection become prohibitive in large-scale systems. Motivated by the need for practically feasible schemes, we present a randomized greedy algorithm for finding an approximate solution to (9) and derive its performance guarantees. In particular, inspired by the technique in <ref type="bibr">[13]</ref> proposed in the context of optimizing submodular objective functions, we develop a computationally efficient randomized greedy algorithm (see Algorithm 1) that finds an approximate solution to <ref type="bibr">(9)</ref> with a guarantee on the achievable MSE performance of the Kalman filter that uses only the observations of the selected sensors. Algorithm Algorithm 1 Randomized Greedy Sensor Scheduling 1: Input:</p><p>4: for i = 0, . . . , K 1 </p><p>Set S</p><p>8:</p><p>1 performs the task of sensor scheduling in the following way. At each iteration of the algorithm, a subset R of size s is sampled uniformly at random and without replacement from the set of available sensors. The marginal gain provided by each of these s sensors to the objective function is computed using <ref type="bibr">(10)</ref>, and the one yielding the highest marginal gain is added to the set of selected sensors. Then the efficient recursive formula in ( <ref type="formula">11</ref>) is used to update F 1 S so it can be analyzed when making the selection in the next iteration. This procedure is repeated K times.</p><p>Remark 2: The parameter &#9999; in Algorithm 1, e K &#63743; &#9999; &lt; 1, is a predefined constant that is chosen to strike a desired balance between performance and complexity. When &#9999; = e K , each iteration includes all of the non-selected sensors in R and Algorithm 1 coincides with the conventional greedy scheme. However, as &#9999; approaches 1, |R| and thus the overall computational complexity decreases.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Performance analysis of the proposed scheme</head><p>In this section we analyze Algorithm 1 and in Theorem 2 provide a bound on the performance of the proposed randomized greedy scheme when applied to finding an approximate solution to maximization problem <ref type="bibr">(9)</ref>.</p><p>Before deriving the main result, we first provide two lemmas. Lemma 3 states an upper bound on the difference between the values of the objective function corresponding to two sets having different cardinalities while Lemma 4 provides a lower bound on the expected marginal gain. Lemma 3. Let {C l } n 1 l=1 denote the element-wise curvatures of f (S). Let S and T be any subsets of sensors such that S &#8674; T &#10003; [n] with |T \S| = r. Then it holds that</p><p>where</p><p>k be the set of selected sensors at the end of the i th iteration of Algorithm 1. Then</p><p>where O k is the set of optimal sensors at time k, (i + 1) s is the index of the selected sensor at the (i + 1) st iteration, </p><p>where c = max{C max , 1}, e K &#63743; &#9999; &lt; 1, and = 1 + max{0, s 2n 1 2(n s) }. Furthermore, the computational complexity of Algorithm 1 is O(nm 2 log( 1 &#9999; )) where n is the total number of sensors and m is the dimension of x k .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proof. Consider S (i)</head><p>k , the set generated by the end of the i th iteration of Algorithm 1. Employing Lemma 3 with S = S (i)</p><p>k , and using monotonicity of f , yields</p><p>where</p><p>Applying the law of total expectation yields</p><p>Using the definition of the maximum element-wise curvature, we obtain</p><p>It is easy to verify, e.g., by taking the derivative, that g(r) is decreasing (increasing) with respect to r if C max &lt; 1</p><p>Hence,</p><p>Using an inductive argument and due to the fact that f (;) = 0, we obtain</p><p>Finally, using the fact that (1+x) y &#63743; e xy for y &gt; 0 and the easily verifiable fact that e ax &#63743; 1+axe a for 0 &lt; x &lt; 1,</p><p>To take a closer look at computational complexity of Algorithm obtained by forming an estimate using sensors selected by the randomized greedy algorithm is within a factor of the optimal MSE. &#9999; c . Let MSE S k denote the mean-square estimation error obtained by forming an estimate using information provided by the sensors selected by Algorithm 1 at time k, and let MSE o be the optimal mean-square error formed using information collected by the sensors specified by the optimum solution of <ref type="bibr">(9)</ref>. Then the expected MSE S k is bounded as</p><p>Remark 3: Since the proposed sensor selection scheme is a randomized algorithm, the analysis of its expected MSE, as provided by Theorem 2 and Corollary 2.1, is a meaningful performance characterization. Notice that, as expected, &#8629; is decreasing in both c and &#9999;. If f (S) is characterized by a small curvature, then f (S) is nearly submodular and the randomized greedy algorithm delivers a near-optimal sensor scheduling. As we decrease &#9999;, &#8629; increases which in turn leads to a better approximation factor. Moreover, by following an argument similar to that of the classical analysis in <ref type="bibr">[17]</ref>, one can show that the approximation factor for the greedy algorithm is given by</p><p>1 c (see also <ref type="bibr">[22]</ref>, <ref type="bibr">[27]</ref>). Therefore, the term &#9999; c in &#8629; denotes the difference between the approximation factors of the proposed randomized greedy algorithm and the conventional greedy scheme.</p><p>Remark 4: The computational complexity of the greedy method for sensor selection that finds marginal gains via the efficient recursion given in Proposition 1 is O(Knm 2 ). Hence, our proposed scheme provides a reduction in complexity by K/ log( <ref type="formula">1</ref>&#9999; ) which may be particularly beneficial in large-scale networks, as illustrated in our simulation results.</p><p>Remark 5: In contrast to the results of <ref type="bibr">[13]</ref> derived in the context of maximizing monotone submodular functions, Theorem 2 relaxes the submodularity assumption and states that the randomized greedy algorithm does not require submodularity to achieve near-optimal performance. Rather, if the set function is weak submodular, Algorithm 1 still selects a subset of sensors that provide an MSE near that achieved by the optimal subset of sensors. In addition, even if the function is submodular (e.g., if we use the log det objective instead of the MSE), the results of Theorem (2) offer an improvement over the theoretical results of <ref type="bibr">[13]</ref> due to a tighter approximation bound stemming from the analysis presented in the proof of Theorem (2). Moreover, a major assumption in <ref type="bibr">[13]</ref> is that R is constructed by sampling with replacement. Clearly, this contradicts the fact that a sensor selected in one iteration will not be in R in the subsequent iteration with probability one. On the contrary, we assume R is constructed by sampling without replacement and carry out the analysis in this setting that matches the actual randomized greedy sensor selection strategy.</p><p>The randomized selection step of Algorithm 1 can be interpreted as an approximation of the marginal gains of the selected sensors using a greedy scheme <ref type="bibr">[3]</ref>. More specifically, for the i th iteration it holds that f jrg (S</p><p>, where subscripts rg and g refer to the sensors selected by the randomized greedy (Algorithm 1) and the greedy algorithm, respectively, and <ref type="foot">1</ref> In view of this argument, we obtain Theorem 3 which states that if f (S) is characterized by a bounded maximum element-wise curvature and {&#8984; (i) k } K i=1 are independent random variables, Algorithm 1 returns a subset of sensors yielding an objective that with high probability is only a multiplicative factor away from the objective achieved by the optimal schedule. Theorem 3. Instate the notation and assumptions of Theorem 2. Let {&#8984;</p><p>k } K i=1 are independent, then for all 0 &lt; q &lt; 1 with probability at least 1 e CK , it holds that</p><p>for some C &gt; 0.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proof. Consider S (i)</head><p>k , the set generated by the end of the i th iteration of Algorithm 1 and let (i + 1) g and (i + 1) rg denote the sensors selected by the greedy and randomized greedy algorithm in the i th iteration, respectively. Let c = max{C max , 1}. Employing Lemma 3 with S = S (i)</p><p>k , and using monotonicity of f , yields</p><p>Using the fact that</p><p>for all j, we obtain</p><p>On the other hand, f (S</p><p>Combining ( <ref type="formula">50</ref>) and (51) yields</p><p>Using an inductive argument similar to the one in the proof of Theorem 2, and noting that f (;) = 0,</p><p>1 e</p><p>where to obtain (a) we use the fact that (1 + x) y &#63743; e xy for y &gt; 0. Therefore, since by assumption `min</p><p>k &#63743; 1, we establish (46). To show the second statement, i.e., prove (47) holds in the setting of independent {&#8984; (i) k } K i=1 , we apply the Bernstein's inequality <ref type="bibr">[31]</ref> to the sum of independent random variables</p><p>k } are bounded random variables, from Popoviciu's inequality <ref type="bibr">[31]</ref> for all i 2 [K], it follows that</p><p>Hence, based on the Bernstein's inequality, for all 0 &lt; q &lt; 1 Pr{</p><p>where</p><p>where (b) follows because p increases as we replace &#181; i (&#9999;) and `i(&#9999;) by their lower bounds. Finally, substituting this results in (53) yields</p><p>with probability at least 1 e C(&#9999;,q)K . This completes the proof. &#8965;</p><p>Our simulation studies in Section V empirically confirm the results of Theorems 2 and 3, and illustrate that Algorithm 1 performs favorably compared to the competing schemes both on average as well as for each individual sensor scheduling task.</p><p>Similar to Corollary 2.1, we can now obtain a probabilistic bound on the MSE (3) achievable at each time step using the proposed randomized greedy algorithm. This result is stated in Corollary 3.1 below.</p><p>Corollary 3.1. Consider the notation and assumptions of Corollary 2.1 and Theorem 3. Let 0 &lt; q &lt; 1 and define &#8629; = 1 exp( (1 q)&#181;min(&#9999;) c</p><p>). Then, with probability at least 1 e CK it holds that</p><p>for some C &gt; 0.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. SIMULATION RESULTS</head><p>To test the performance of the proposed randomized greedy algorithm, we compare it with the classic greedy algorithm and the SDP relaxation in a variety of settings as detailed next. We implemented the greedy and randomized greedy algorithms in MATLAB and the SDP relaxation scheme via CVX <ref type="bibr">[32]</ref>. All experiments were run on a laptop with 2.0 GHz Intel Core i7-4510U CPU and 8.00 GB of RAM.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Kalman filtering in random sensor networks</head><p>We first consider the problem of state estimation in a linear time-varying system via Kalman filtering. For simplicity, we assume the state transition matrix to be identity, i.e., A k = I m . At each time step, the measurement vectors, i.e., the rows of the measurement matrix H k , are drawn according to N &#8672; (0, 1 m I m ). The initial state is a zero-mean Gaussian random vector with covariance &#8963; x = I m ; and the process and measurement noise are zero-mean Gaussian with covariance matrices Q = 0.05I m and R = 0.05I n , respectively.</p><p>The MSE of the filtered estimator and running time of each scheme is averaged over 100 Monte-Carlo simulations.</p><p>The time horizon for each run is T = 10.</p><p>We first consider a system having state dimension m = 50 and the total number of sensors n = 400. We set a constraint on the number of sensors allowed to be queried at each time step to K = 55 and compare the MSE achieved by each sensor selection method over the time horizon of interest. For the randomized greedy algorithm we set &#9999; = 0.001. Fig. <ref type="figure">2</ref> shows that the greedy method consistently yields the lowest estimation MSE while the MSE provided by the randomized greedy algorithm is slightly higher. The MSE performance achieved by solving the SDP relaxation is considerably larger than those of the greedy and randomized greedy algorithms. The time it takes each method to select K sensors is given in Table <ref type="table">I</ref>. Both the greedy algorithm and the randomized greedy algorithm are much faster than the SDP formulation. Moreover, the randomized greedy scheme is nearly two times faster than the greedy method. Note that, in this example, in each iteration of the sensor selection procedure the randomized scheme only computes the marginal gain for a sampled subset of size 50. In contrast, the classic greedy approach computes the marginal gain for all 400 sensors. In summary, the greedy method yields slightly lower MSE but is much slower than the proposed randomized greedy algorithm.</p><p>To study the effect of the number of selected sensors on the MSE performance, we vary K from 55 to 115 with increments of 10. The MSE values at the last time step for each algorithm are shown in Fig. <ref type="figure">3(a)</ref>. As the number of selected sensors increases, the estimation becomes more accurate, as reflected by the MSE of the estimates provided by each algorithm. Moreover, the differences between the MSE values achieved by different schemes monotonically decrease as more sensors are selected. The sensor selection running times shown in Fig.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>3(b)</head><p>indicate that the randomized greedy scheme is nearly twice as fast as the greedy method, while the SDP method is orders of magnitude slower than both greedy and randomized greedy algorithms.</p><p>Finally, to empirically verify the results of Theorem 3, in Fig. <ref type="figure">4</ref> we compare histograms of MSE achieved by the greedy and the proposed randomized greedy sensor selection schemes with various choices of &#9999; when K = 60. As the figure shows, the MSE of sets selected by the proposed scheme is relatively close to that selected by state-ofthe-art greedy algorithm. In addition, as &#9999; decreases, the MSE of the randomized greedy algorithm approaches that of the greedy algorithm. These empirical observations coincide with our theoretical results in Theorem 3. That is, the proposed algorithm, although a randomized scheme, returns a near-optimal subset of sensors for each individual   sensor selection task.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. State estimation in large-scale networks</head><p>Next, we compare the performance of the randomized greedy algorithm to that of the greedy algorithm as the size of the system increases. We run both methods for 20 different system dimensions. The initial dimensions are set to m = 20, n = 200, and K = 25 and all three parameters are scaled by where varies from 1 to 20. In addition, to evaluate the effect of &#9999; on the performance and runtime of the randomized greedy approach, we repeat experiments for &#9999; 2 {0.1, 0.01, 0.001}. Note that the computational complexity of the SDP relaxation scheme is prohibitive in this setting and hence it is omitted. As the figure illustrates, the gap between the running times grows with the size of the system and the randomized    greedy algorithm performs nearly 28 times faster than the greedy method for the largest network. Fig. <ref type="figure">5</ref> shows that using a smaller &#9999; results in a lower MSE while it slightly increases the running time. These results suggest that, for large systems, the randomized greedy provides almost the same MSE while being much faster than the greedy algorithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Accelerated multi-object tracking</head><p>Finally, we study the multi-object tracking application introduced in Section II-A. Specifically, we consider a scenario where twenty moving objects are initially uniformly distributed in a 5 &#8677; 10 area. At each time instance, the objects move in a random direction with a constant velocity set to 0.2. Twenty UAVs, equidistantly spread over the area, move according to a periodic parallel-path search pattern <ref type="bibr">[33]</ref>. The initial phases of the UAVs' = Tr</p><p>where (a) is obtained by applying matrix inversion lemma (Sherman-Morrison formula) <ref type="bibr">[30]</ref> to (F S + 2 j h k,j h &gt; k,j ) 1 , and (b) follows from the properties of the matrix trace operator. Finally, since F S is a symmetric positive definite matrix, f j (S) &gt; 0 which in turn implies monotonicity. (62) Note that (62) is established for a specific ordering of elements in T \S. Given an ordering {j 1 , . . . , j r }, one can form a set P = {P 1 , . . . , P r } of r permutations (e.g., by defining the right circular-shift operator P t ({j 1 , . . . , j r }) = {j r t+1 , . . . , j 1 , . . . } for 1 &#63743; t &#63743; r) such that P p (j) 6 = P q (j) for p 6 = q and 8j 2 T \S; (62) holds for each such permutation. By summing the corresponding r inequalities we obtain</p><p>Rearranging (63) yields the desired result.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>APPENDIX C</head><p>PROOF OF LEMMA 4</p><p>First, we aim to bound the probability of an event that a random set R contains at least one index from the optimal set of sensors which is a necessary condition to reach the optimal MSE. Let us consider S </p><p>where H p is the p th harmonic number, is the Euler-Mascheroni constant, and</p><p>) is a monotonically decreasing sequence related to Hurwitz zeta function <ref type="bibr">[34]</ref>. Therefore, using the identity (65) we obtain</p><p>&#63743; ((1</p><p>where (c) follows since &#8675; n &#8675; n s = 1 </p><p>by the definition of s and the fact that 1 e s n x is a concave function. Finally, according to Lemma 2 in <ref type="bibr">[13]</ref>,</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>j + max (P k|k 1 )x</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_1"><p>Notice that `i(&#9999;) and &#181; i (&#9999;) are time-varying quantities where the time index is omitted for the simplicity of notation.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2"><p>Without a loss of generality, we assume that s is an integer.</p></note>
		</body>
		</text>
</TEI>
