<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Cooperative Visual-Inertial Odometry</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>05/30/2021</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10298613</idno>
					<idno type="doi"></idno>
					<title level='j'>International Conference on Robotics and Automation</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>P. Zhu</author><author>Y. Yang</author><author>W. Ren</author><author>G. Huang</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[This paper studies the problem of multi-robot cooperative visual-inertial localization where each robot is equipped with only a single camera and IMU. We develop two cooperative visual-inertial odometry (C-VIO) algorithms within the multi-state constraint Kalman filter (MSCKF) framework, in which each robot utilizes not only its own measurements but constraints of common features co-observed with its neighbors within the current sliding window in order to improve the localization accuracy. The first centralized-equivalent algorithm tracks the robot-to-robot cross correlations and prioritizes the pose accuracy while requiring full capacity communication among all the robots during update. The second distributed algorithm ignores the robot-to-robot cross correlations to obtain a scalable, robust and efficient fully distributed structure where each robot only keeps its own states and communicates with its neighbors, while a covariance intersection (CI)-based update strategy is leveraged to guarantee consistency. The proposed algorithms are validated extensively in both Monte-Carlo simulations and real-world datasets, and shown to be able to achieve better accuracy with competitive efficiency.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION AND RELATED WORK</head><p>Multi-robot autonomous systems with the ability of communication and perception can collaborate to accomplish missions (e.g., area surveillance, environmental monitoring and disastrous rescue) more efficiently and robustly than a single robot. To successfully accomplish these tasks, robots are required to have robust high-precision 3D localization, in particular, when navigating GPS-denied environments. One promising solution is the visual-inertial navigation system (VINS), which is to fuse both the measurements from cameras and inertial measurement units (IMUs) <ref type="bibr">[1]</ref>. These sensors are cheap and light-weight but they are complementary and able to provide rich environmental information, hence enabling highly-accurate motion estimation. To date, many VINS algorithms have been developed <ref type="bibr">[2]</ref>- <ref type="bibr">[7]</ref>, and among them, multi-state constraint Kalman filter (MSCKF) <ref type="bibr">[5]</ref> is arguably one of the most popular ones. However, most of current VINS are designed for the case of a single robot.</p><p>One intuitive strategy to localize a group of robots is to let each member run a single-robot VINS algorithm independently. However, additional geometric constraints (e.g., common feature observations, relative robot-to-robot measurements) can be explored in multi-robot systems to improve the localization performance, if robots communicate with each other. Then, it holds great potential to design cooperative VINS (C-VINS) algorithms for multi-robot systems. One of the challenges for developing a C-VINS algorithm, especially {yuyang,ghuang}@udel.edu for a large robot team, is the stringent resource limitations associated with communication bandwidth, battery power and computational capacity. Therefore, it is highly desirable to design a distributed algorithm which is scalable, robust and efficient for multi-robot systems.</p><p>Relative robot-to-robot pose or distance observations have been utilized to perform cooperative localization (CL) for a robot group <ref type="bibr">[8]</ref>- <ref type="bibr">[11]</ref>. Roumeliotis et al. <ref type="bibr">[8]</ref> proposed an algorithm that distributes the computation of a centralized EKF to each robot. Though each robot propagates distributively, all-to-all robot communication is required in the update steps. To relax the communication requirements, Lukas et al. <ref type="bibr">[9]</ref> approximated the robot-to-robot covariances instead of tracking the exact values. However, the filter's consistency can not be guaranteed. Carrillo et al. <ref type="bibr">[10]</ref> introduced a consistent algorithm based on covariance intersection (CI) where each robot only estimates its own state and covariance while robot-to-robot cross correlations are ignored. However, these works address CL only in 2D where each robot's state only contains the planar position and heading.</p><p>Recently, relative robot-to-robot bearing observations are used in <ref type="bibr">[12]</ref>, <ref type="bibr">[13]</ref> to localize a group of two robots in 3D environments. Martinelli et al. <ref type="bibr">[12]</ref> provided an analytic solution to determine the relative state between two robots by establishing a Polynomial Equation System (PES). All the singularities and minimal cases that could harm the estimation performance were analyzed in <ref type="bibr">[13]</ref>. However, the results are limited to the two-robot case using synchronized-calibrated cameras and unbiased inertial measurements. Nguyen et al. <ref type="bibr">[14]</ref> introduced the Coupled Probabilistic Data Association Filter (CPDAF) to estimate the robots' poses in a common frame using relative robot-to-robot bearing and distance measurements. But the algorithm is centralized and is only validated in simulation environments. In addition, <ref type="bibr">[8]</ref>- <ref type="bibr">[14]</ref> all rely on relative robot-to-robot measurements which are difficult to achieve in many realistic scenarios due to robot sensing ability and environmental occlusion.</p><p>Alternatively, environmental common features can be used to improve the CL performance <ref type="bibr">[15]</ref>- <ref type="bibr">[18]</ref>. Cooperative SLAM (C-SLAM) algorithms enable multiple robots to build a joint map which is used to perform CL simultaneously. Paull et al. <ref type="bibr">[15]</ref> developed a graph-based C-SLAM algorithm for autonomous underwater vehicles (AUVs) among which communications using acoustics is challenging. Lajoie et al. <ref type="bibr">[16]</ref> introduced a fully distributed C-SLAM algorithm with a distributed outlier rejection for incorrect loop closures. The robots' estimated trajectories are retrieved through a distributed graph optimization. However, C-SLAM may become computational and memory expensive as the map size increases, which could be prohibitive in resource-constrained platforms. Being closest to our work, Melnyk et al. <ref type="bibr">[17]</ref> introduced the CL-MSCKF algorithm, which utilized the common feature measurements to improve the VIO accuracy, without building a map. The authors performed observability analysis to determine the minimum common features needed for computing the relative transformation between two robots. However, the CL-MSCKF follows a centralized formulation with only simulation results where all robots' IMU and camera measurements are collected and processed in a fusion center to estimate all the robots' states.</p><p>In this paper, we propose a fully distributed multi-robot pose estimation algorithm with a new centralized-equivalent counterpart as the benchmark. Commonly observed features from each robot are exploited to improve the CL accuracy. The proposed C-VIO algorithms are based on the state-ofthe-art OpenVINS <ref type="bibr">[19]</ref>, possessing advanced features such as full online calibration (including camera intrinsic, camerato-IMU spatiotemporal parameters) and First-Estimates Jacobian (FEJ) treatments <ref type="bibr">[20]</ref> to improve estimation consistency. In particular, the main contributions of this paper include:</p><p>&#8226; We design a fully distributed cooperative (DISC)-VIO algorithm based on CI where each robot only maintains and estimates its own states by communicating with its neighbors. DISC-VIO is scalable, robust, and computationally and communicationally efficient while still improving the localization performance compared to a single robot system. &#8226; We design a centralized-equivalent cooperative (CEC)-VIO algorithm as the benchmark, where, in contrast to <ref type="bibr">[17]</ref>, only camera measurements are processed centrally while each robot propagates and clones poses distributedly. CEC-VIO achieves equivalent accuracy as the centralized one in <ref type="bibr">[17]</ref>. &#8226; Extensive Monte-Carlo simulations and real-world datasets are used to validate the proposed C-VIO algorithms in a variety of environments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. PROBLEM FORMULATION</head><p>We define a robot set of N robots as V. Each robot i (i &#8712; V) is mounted with a monocular VI sensor that measures the robot's self-motion and observes environmental features. The tracked features are divided into common features (observed by more than one robot) and independent features (observed by only one robot). We propose, respectively, a centralizedequivalent and a fully distributed C-VIO algorithm to estimate all robots' states by leveraging the MSCKF <ref type="bibr">[21]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Visual-Inertial State</head><p>For any robot i, the state vector x i consists of the current inertial navigation state x Ii , a sliding window of c historical IMU clones x Ci , and robot i's camera intrinsic and camerato-IMU spatial/temporal parameters x Wi</p><p>x Ii =</p><p>where</p><p>G q is the JPL unit quaternion <ref type="bibr">[22]</ref> parameterizing the rotation</p><p>k are the position and velocity of the IMU expressed in the global frame, and b &#969; i,k and b a i,k are the IMU's gyroscope and accelerometer biases, respectively. The constant calibration parameters x Wi includes the time-offset Ci t Ii between robot i's camera and IMU clocks, the spatial transformations { Ci Ii q, Ci p Ii } from the IMU frame to the camera frame, and the camera intrinsics &#950; i containing the focal lengths, the position of the principal point and the distortion parameters.</p><p>To perform EKF estimation, we use a minimal 3dimensional representation &#948;&#952; to parameterize the orientation error as q &#948;&#952; 2 1</p><p>&#8855; q where &#8855; denotes quaternion multiplication <ref type="bibr">[22]</ref> and &#8226; denotes the estimated value. While for all the other variables in the state vector, we use the standard additive error definition (e.g., G p Ii = G pIi + G pIi with &#8226; denoting the error).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. IMU and Visual Feature Measurements</head><p>Each robot i's IMU measures the angular velocity, Ii &#969; m , and the linear acceleration, Ii a m as follows</p><p>where Ii &#969; and Ii a denote the IMU's true angular velocity and the true linear acceleration, n &#969;i and n ai are the corresponding continue-time Gaussian white noise, and G g is the gravitational acceleration.</p><p>Let us consider the observation of a feature f obtained by robot i's camera at time step k. Note that as time-offset inevitably exists between the IMU and camera clocks, the time step k expressed in robot i's camera clock, Ci t k , differs from the same instant expressed in its IMU clock, Ii t k . These two time instants are related by the time-offset</p><p>The perspective projection measurement function for this feature is given by</p><p>where n i is the raw pixel noise assumed to be zero-mean Gaussian, w i (&#8226;) is the function mapping the normalized image coordinates onto the image plane according to the camera model (e.g., radial-tangential or fisheye <ref type="bibr">[23]</ref>) and the corresponding intrinsics</p><p>is the feature's position expressed in robot i's camera frame at timestamp Ii t k , which is related to the normalized image coordinates through the projection function</p><p>where G p f is the feature's global position which is time invariant and unknown.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. CENTRALIZED-EQUIVALENT ALGORITHM</head><p>In this section, we present the CEC-VIO which serves as the benchmark for the ensuing DISC-VIO.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Propagation and Stochastic Cloning</head><p>When any robot i's camera captures the k-th image with timestamp Ci t k , by using robot i's collected IMU measurements ( <ref type="formula">5</ref>) and ( <ref type="formula">6</ref>) over the time window [ Ii t k-1 , Ii t k ] (the corresponding time interval expressed in the IMU clock from time step k -1 to k), which is denoted as I i , we propagate the current inertial navigation state x Ii at time step k -1 to the next time step k based on the IMU dynamics <ref type="bibr">[24]</ref>:</p><p>where n Ii is the stacked vector of robot i's IMU measurement noises. The subscript k|k -1 for xI i,k|k-1 denotes the predicted estimate at time step k given the measurements up to time step k -1. Note that the clones x Ci and the calibration parameters x Wi do not evolve over time.</p><p>Based on the IMU dynamics, we can compute the statetransition matrix &#934; i and the discrete noise covariance Q i across the time interval</p><p>. The corresponding entries for the non-evolving state x Ci and x Wi in &#934; i and Q i are identity and zero, respectively. Then, robot i's state covariance is propagated as</p><p>It is important to note that once robot i has used another robot j's measurements of the common features to update its own state x i , these two robots' states are correlated afterwards. So in general, we have</p><p>To achieve a distributed propagation step, we split the cross-covariance according to <ref type="bibr">[8]</ref>, <ref type="bibr">[9]</ref> </p><p>where the factor &#963; ij maintained at robot i equals to P ij , and the factor &#963; ji maintained at robot j is identity. Then, robot i can propagate the cross-covariance to all the other robots by updating the factors</p><p>After propagating to time step k, the state vector is augmented with an estimate of the IMU pose at the true time Ii t k , which will be a clone state in x Ci . This augmented state can be written as a function g(x i ) of the current state vector <ref type="bibr">[25]</ref>. We can compute the state Jacobian J i from g(x i ) and then augment robot i's current covariance by <ref type="bibr">[26]</ref> </p><p>Note that we also need to augment the cross-covariance</p><p>. Later, the cross correlations that are needed for updating the estimate will be reproduced using these factors when an update is triggered.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Update</head><p>To check if there exist common features, the extracted features' measurements with their descriptors from each robot's k-th image are collected at a centering robot, where runs a centralized update. For an independent feature, if it is lost or reaches the maximum track length (the sliding window size), we perform MSCKF update. While to optimally utilize the common feature measurements, MSCKF update is triggered when a common feature is lost in all the associated robots, or reaches the maximum track length of any associated robot, or contains the oldest clone in any associated robot's state.</p><p>Note that all the robots' states will be updated either using common features or independent features, as the robots' states are correlated. When the update is triggered for a feature, all the robots' states, covariances and cross-covariance factors together with this feature's measurements are collected at the centering robot. The current cross correlations are reproduced by</p><p>where i, j &#8712; V and j = i. Then, by using all the measurements and the corresponding clones, we triangulate the feature's global position G p f and compute the stacked vector of measurement residuals, r, associated with this feature based on the measurement model ( <ref type="formula">8</ref>), <ref type="bibr">(9)</ref>. The linearized residual system can be written as</p><p>where <ref type="figure">H</ref> x and H f denote the corresponding stacked Jacobians with respect to the state and the feature, and n is the stacked noise vector. To avoid storing the feature's position in our state, we project system <ref type="bibr">(17)</ref> onto the nullspace of feature Jacobian H f . With matrix N whose columns form a basis of this nullspace, we can have:</p><p>Equation ( <ref type="formula">19</ref>) can be directly used to perform the EKF update. After that, the new cross correlation P ij,k|k are decomposed using (13) again. The resulting factors together with the states and covariances will be sent back to the corresponding robots.</p><p>IV. FULLY DISTRIBUTED ALGORITHM In this section, we present the DISC-VIO in which the robot-to-robot cross-correlations are conservatively dropped off via covariance intersection.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Propagation and Stochastic Cloning</head><p>When any robot i receives the k-th image with timestamp Ci t k , robot i's IMU measurements of the angular velocity, Ii &#969; m , and the linear acceleration, Ii a m , over the time interval [ Ii t k-1 , Ii t k ] are utilized to propagate the state and the covariances from time step k -1 to k based on <ref type="bibr">(11)</ref> and <ref type="bibr">(12)</ref>. After that, an estimate of the current IMU pose is appended to the state vector as a clone state and the covariance is augmented as <ref type="bibr">(15)</ref>. Each robot performs these two steps individually.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Update</head><p>When any robot i's camera captures the k-th image, it receives the extracted features' descriptors of the latest images from its communication neighbors. Feature matching is performed in pair between robot i's k-th image features and the received ones to check if there exist common features. Then, the features tracked at robot i can be classified as an independent feature or a common feature. If a feature is a common feature, robot i will store its neighbors' measurements.</p><p>When a feature is lost or reaches its maximum track length at robot i, we trigger the update. If it is an independent feature, the standard single-robot MSCKF update <ref type="bibr">[21]</ref> is performed. Note that if this feature is a common feature at robot i, it could have been tracked across multiple frames in more than two robots. Let N f be a robot set in which this feature is tracked. So i &#8712; N f . Further, we denote {j 1 , &#8226; &#8226; &#8226; , j L } as a subset of N f that excludes robot i. Additionally, we let n (&#8226;) be the corresponding stacked noise vector for a stacked residual system. To process this common feature, robot i receives the associated clones, states and the corresponding covariances from robot j l (l = 1, &#8226; &#8226; &#8226; , L). By using all available measurements from the robots in N f , we triangulate the feature's global position. After that, we use robot i's measurements to compute a stacked linearized residual system</p><p>We perform Givens rotations <ref type="bibr">[27]</ref> to zero-out rows in H i,f with indices larger than 3, and apply the same Givens rotations to H i,x and r i,k . Then, system ( <ref type="formula">20</ref>) is split into a subsystem that depends on the feature's position and another subsystem that does not.</p><p>Similarly, we can use robot j l 's measurements to compute a stacked linearized residual system</p><p>which is separated into the following two subsystems by performing Givens rotations on H j l ,f , r j l ,k and H j l ,x .</p><p>Note that the bottom reduced system in ( <ref type="formula">21</ref>) is only associated with robot i's state, which can be directly used to update robot i's estimate by using the EKF. Stacking the top systems in ( <ref type="formula">21</ref>) and ( <ref type="formula">23</ref>) for all j l (l = 1, &#8226; &#8226; &#8226; , L), we obtain a new system that dependents on the feature's position.</p><p>where Diag{&#8226;} denotes the block-diagonal matrix constructed from the elements. Then, we project system <ref type="bibr">(24)</ref> onto the nullspace of Hf and obtain a system that is independent of G pf .</p><p>which contains additional constrains of the common feature.</p><p>Here, the noise vector n i is assumed to be zero-mean Gaussian with covariance R i . It is important to note that performing the standard EKF update using (25) will yield an inconsistent estimate which degrades the performance seriously, since xi and xj l (l = 1, &#8226; &#8226; &#8226; , L) are in general correlated with unknown correlations. To guarantee consistency, we adopt the CI algorithm <ref type="bibr">[28]</ref>:</p><p>where the left side is the CI covariance with zero offdiagonal elements and the right hand side is the unknown <ref type="formula">26</ref>) into the EKF using system (25), we obtain the following equations to update robot i' estimate.</p><p>where &#948;x i is the correction to the state estimate xi .</p><p>V. EXPERIMENTAL RESULTS We compared our algorithms to the case where each robot performs independent MSCKF using a single camera and IMU. We represent this case as OpenVINS for simplicity. Compared with the proposed CEC-VIO and DISC-VIO, the only difference of settings in OpenVINS is that no common features are detected and used. Note that we do not include any SLAM features <ref type="bibr">[29]</ref>, the features that can be tracked beyond the clone window, in the state vector. To fully test the proposed algorithms, we first performed extensive Monte-Carlo simulations on the synthetic data generated from realistic trajectories in three different scenarios. Following that, a real-world test was performed.</p><p>The weights in CI <ref type="bibr">(26)</ref> are usually chosen by minimizing the uncertainty of the posterior covariance <ref type="bibr">[10]</ref>, <ref type="bibr">[28]</ref>. In DISC-VIO, one can gain these weights by minimizing the trace of P i,k|k in (28) at every update step. This optimization problem is non-convex and could be time consuming as the size of N f increases. Through extensive simulations and the real-world test, we find that fixed weights in DISC-VIO work well and these weights are easy to choose. To speed up the process of DISC-VIO and demonstrates its applicability in resource-constrained platforms, we test DISC-VIO with fixed weights. In both simulations and the real-world tests, for any robot i running DISC-VIO, we let &#969; j l = 0.008 (l = 1, &#8226; &#8226; &#8226; , L) and then &#969; i = 1l &#969; j l . For example, when updating robot 0's estimate using a common features that is also tracked by robot 1 and robot 2, we have &#969; 1 = &#969; 2 = 0.008 and &#969; 0 = 0.984.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Monte-Carlo Simulations</head><p>We simulated a team of three robots in different environments depicted in Fig. <ref type="figure">1</ref>. In each group, one robot follows a realistic trajectory which is used to create the trajectories of the other two robots by adding orientation and position offsets. The IMU and camera measurements are generated from these simulated trajectories. The first simulated dataset, termed as "EuRoC MH5", is based on the 97 meters "Machine Hall 05" dataset from EuRoC MAV datasets <ref type="bibr">[30]</ref>. The second simulated dataset is called "Udel gore" based on a 240 meters dataset which traverses three floors in the University of Delaware's Gore Hall <ref type="bibr">[19]</ref>. The third dataset is called "Tum corridor" which is highly dynamic and based on the 295 meters "Tum Corridor 1" dataset from the TUM datasets <ref type="bibr">[31]</ref>. To ensure a fair comparison, the same parameters summarized in Table <ref type="table">I</ref> were used in all algorithms for three robots.</p><p>We performed 50 Monte-Carlo runs on each simulated dataset. The Root Mean Square Error (RMSE) and Normalized Estimation Error Squared (NEES) results averaged over all runs for robot 0 (R0) are given in Fig. <ref type="figure">1</ref>. Results for robots 1 (R1) and 2 (R2) are similar and omitted here. Table <ref type="table">II</ref> provides the RMSE results averaged over all runs and time steps for three robots. Fig. <ref type="figure">1</ref> and Table <ref type="table">II</ref> show that the proposed CEC-VIO and DISC-VIO outperform OpenVINS in all three simulated scenarios. Especially for CEC-VIO, the errors can be reduced by more than half. As shown on the right in Fig. <ref type="figure">1</ref>, DISC-VIO has the smallest NEES as expected, since CI makes the estimate more conservative.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Real-World Tests</head><p>We further evaluated the proposed CEC-VIO and DISC-VIO in the room scenarios from the TUM VI datasets <ref type="bibr">[31]</ref> which provide 20 Hz stereo image (only the left image is leveraged in the test), 200 Hz IMU measurements and the accurate pose ground truths from a motion capture system for entire trajectories. We simultaneously read three bags collected in the same room to mimic a three-robot cooperative case. Three robots' trajectories shown in Fig. <ref type="figure">2</ref> are 146, 135 and 131 meters long, respectively.</p><p>We initialized three robots' inertial navigation states with the corresponding pose ground truths, zero velocities and zero gyroscope and accelerometer biases. All extrinsics and  intrinsics were initialized with the values provided in the datasets and each robot's camera-to-IMU time offsets were set to zeros. Features were uniformly extracted using FAST <ref type="bibr">[32]</ref> and tracked for each robot's image stream or matched across different robots' images using ORB <ref type="bibr">[33]</ref> with 8point RANSAC to reject outliers. Up to 150 features were extracted and tracked over a sliding window with size 11. For a fair comparison, filters on each robot were initialized with the same values, and the same settings above were used for all the robots and algorithms.</p><p>It is worth noting that throughout the entire trajectories, common features only occasionally appear. Table <ref type="table">III</ref> shows the average RMSE results of three robots and Fig. <ref type="figure">3</ref> shows the Relative Pose Error (RPE) results computed over three robots' trajectories. It is clear that in term of both RMSE and RPE, DISC-VIO and CEC-VIO improve the accuracy for all three robots with limited common features, and CEC-VIO achieves the best performance as expected. Additionally, we timed the complete execution time of all three robots at each frame that mainly includes the timing for propagation, tracking, matching across robots for C-VIO, update and marginalization. As shown in Fig. <ref type="figure">4</ref>, probably due to the fixed weights used in CI, DISC-VIO only takes a bit more time than OpenVINS.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. CONCLUSIONS AND FUTURE WORK</head><p>In this paper, we have developed two cooperative VIO (C-VIO) algorithms for multi-robot cooperative localization. (i) The first one is CEC-VIO that has centralized performance but distributed state propagation and clone, which sets up the benchmark performance; (ii) The second one is DISC-VIO which leverages CI to design a fully distributed C-VIO algorithm. DISC-VIO is shown to be more scalable and efficient and robust due to its fully distributed architecture. Common features tracked over the sliding-window are exploited to improve the localization accuracy for both algorithms. The  In the future, we will further improve our algorithms by including SLAM features in the state vector as well as using robot-to-robot measurements (if available), and deploy the the proposed algorithms on real multi-MAV systems. </p></div></body>
		</text>
</TEI>
