<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>UD-MIMO: Uplink Distributed MIMO for Wireless LANs</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>07/06/2021</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10290284</idno>
					<idno type="doi">10.1109/SECON52354.2021.9491622</idno>
					<title level='j'>2021 18th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON)</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Hossein Pirayesh</author><author>Pedram Kheirkhah Sangdeh</author><author>Qiben Yan</author><author>Huacheng Zeng</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Wireless local area networks (WLANs) are a key component of the telecommunications infrastructure in our society. While many solutions have been proposed to improve their downlink throughput, techniques for enhancing their uplink throughput remain limited. This stagnation can be attributed to the lack of fine-grained inter-node synchronization caused by the hardware limitations of most devices. In this paper, we present an uplink distributed multiple-input multiple-output scheme (termed UD-MIMO) for WLANs that enables concurrent uplink transmission in the absence of fine-grained inter-node synchronization. The enabling technique behind UD-MIMO is a practical solution for decoding uplink packets from asynchronous users. UD-MIMO makes it possible for WLANs to significantly improve their uplink throughput without requiring tight inter-node synchronization. We have built a prototype of UD-MIMO on a wireless testbed and demonstrated its compatibility with commercial off-the-shelf Atheros 802.11 client devices (with a modified Linux driver). Our experimental results show that, for a WLAN with 8 APs in a conference room, UD-MIMO offers 3.4× the throughput of an interference-avoidance approach.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>The proliferation of wireless devices, driven by emerging concepts such as smart cities, intelligent transportation systems, and the Internet of Things, has led to unprecedented demand for wireless services. Cisco predicts that wireless demand would double in the next two years and reach 120 exabytes per month by 2021 <ref type="bibr">[1]</ref>. As a key component of the telecommunications infrastructure in our society, wireless local area networks (WLANs) carry even more data traffic for mobile devices than cellular networks. The predicament facing WLANs is that the growth of their capacity cannot keep up with the growth of wireless demand. This predicament becomes particularly daunting in dense wireless environments such as conference rooms, football stadiums, cinemas, and airports.</p><p>Fig. <ref type="figure">1</ref>: Illustrating distributed MIMO in a WLAN.</p><p>A straightforward idea to increase the capacity of WLANs is to deploy more access points (APs) to enrich the service resources for users. This approach, however, does not work in dense wireless environments, because the capacity of existing WLANs does not scale with the number of APs. The reason is that existing WLANs use the carrier sense multiple access (CSMA) protocol to manage interference. Such an interference-avoidance protocol allows only one AP to access the spectrum in a collision domain, no matter how many APs are deployed in the area. Another idea to increase the capacity of WLANs is to enhance the capability of each AP <ref type="bibr">[2]</ref>. Given the advancement of multiple-input-multiple-output (MIMO) technology in the past decades <ref type="bibr">[3]</ref>, it is common nowadays for a commercial AP (Wi-Fi router) to be equipped with multiple antennas. 
However, advances in individual APs cannot fundamentally solve the network capacity problem because the number of antennas on an AP is limited by its physical size.</p><p>Distributed MIMO has been widely regarded as a promising technique to improve the capacity of WLANs. Given that the APs are connected via high-speed Ethernet cables in some scenarios, all the APs can jointly process the signals from/to multiple users. With a proper design, the APs can serve many users simultaneously instead of being limited by their co-channel interference. Consider the WLAN in Fig. <ref type="figure">1</ref> for example. If the network uses a CSMA-based interference-avoidance technique, only one AP can access the spectrum at a time. In contrast, if the APs jointly process the signals, then the WLAN resembles a 4&#215;4 multi-user MIMO (MU-MIMO) system, making it possible for the APs to serve the four users simultaneously.</p><p>While the throughput gain of distributed MIMO is attractive, the realization of distributed MIMO in practical WLANs is challenging. The challenge lies in the clock/time synchronization among the network devices (APs and STAs). Although the APs are connected via Ethernet cables, those connections are suitable for data packet transmission but not for clock/time synchronization. As such, synchronizing the network devices for distributed MIMO is not a trivial problem. Although some system papers have studied the synchronization issues to enable distributed MIMO for WLANs, most of them focus on downlink transmission (see, e.g., <ref type="bibr">[4,</ref><ref type="bibr">5,</ref><ref type="bibr">6,</ref><ref type="bibr">7]</ref>). Very limited progress has been made so far in the design of a practical distributed MIMO scheme for concurrent uplink transmission. One may argue that most traffic in WLANs is carried by the downlink, and thus that uplink capacity is not a pressing concern. 
This may not hold in the coming decades, given the increasing popularity of cloud-based applications that require frequent data transmission from user devices to the cloud <ref type="bibr">[8]</ref>.</p><p>978-1-6654-4108-7/21/$31.00 &#169;2021 IEEE</p><p>In this paper, we present UD-MIMO, an uplink distributed MIMO scheme for WLANs. We consider a WLAN as shown in Fig. <ref type="figure">1</ref> and focus on busy wireless environments such as conference rooms, enterprises, hotels, shopping malls, and airports. We assume that multiple users send their packets to the APs simultaneously. Upon reception of the mixed signals, the APs send the received signals to an AP processor via an Ethernet connection, which typically has high speed and low latency. At the AP processor, a signal detection technique is employed to decode the data packets. In such a network, if the devices were perfectly synchronized, then UD-MIMO would be identical to MU-MIMO, and conventional multi-user detection (MUD) methods such as zero-forcing (ZF) and minimum mean square error (MMSE) would be able to decode the signals at the AP processor. In reality, however, the network devices are driven by independent oscillators. Consequently, they are neither time-aligned nor frequency-synchronized, making the signal detection problem particularly challenging.</p><p>One natural approach to solving the signal detection problem is to design a sophisticated protocol to synchronize the network devices. This approach, however, has two issues. First, time synchronization among stations (STAs, a.k.a. user devices) is not easy to achieve. In order for the APs to decode the packets, the time misalignment of the STAs' transmissions should be less than the cyclic prefix (CP) of an orthogonal frequency division multiplexing (OFDM) symbol, which is 800 ns in 802.11 networks. Given the STAs' mobility, achieving such fine-grained time synchronization among all the STAs would incur a large amount of airtime overhead. 
This issue is reflected in the IEEE 802.11ac standard <ref type="bibr">[9]</ref>, which supports downlink MU-MIMO but does not support uplink MU-MIMO. Second, the time and frequency synchronization of STA-side transmissions requires hardware modification of the user devices. Doing so would make UD-MIMO incompatible with existing 802.11 devices. For these two reasons, synchronizing the STAs for uplink transmissions is not a good approach to pursue.</p><p>We, therefore, explore an alternative approach: Instead of synchronizing the STAs, we live with their asynchrony and tackle the issue on the AP side. Specifically, we develop a new MUD method that can decode the asynchronous data packets from multiple STAs. Through sophisticated signal processing functions, the new MUD method can decode the data packet from each STA by treating the packets from other STAs as interference. As such, it does not require synchronization among the STAs. This new MUD method not only removes the need for hardware modification of user devices but also eliminates the large airtime overhead induced by synchronization protocols.</p><p>We have built a prototype of UD-MIMO and evaluated its performance on two wireless testbeds: (i) The APs are custom-built using USRP devices, and the STAs are commercial Atheros 802.11 dongles with modified drivers. (ii) Both APs and STAs are custom-built using USRP devices. Based on our experimental results, we make the following observations: (i) UD-MIMO is compatible with commercial off-the-shelf Atheros 802.11 devices (with a modified Linux driver). (ii) For a WLAN with 8 APs deployed in a conference room, UD-MIMO offers 3.4&#215; the uplink throughput of a CSMA-based interference-avoidance approach. Meanwhile, UD-MIMO achieves more than 82% of the throughput of MU-MIMO, where all the APs and STAs are perfectly synchronized via external clocks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. RELATED WORK</head><p>Synchronization in Distributed MIMO: The papers most relevant to this work are <ref type="bibr">[4,</ref><ref type="bibr">5,</ref><ref type="bibr">6]</ref>. In <ref type="bibr">[4]</ref>, a scheme called JMB (or MegaMIMO) was proposed to enable downlink distributed MIMO in WLANs. Its main effort focuses on realizing phase and time synchronization among independent APs so that a joint beamforming technique can be used to enable downlink MU-MIMO transmission. A similar idea called Airsync was proposed in <ref type="bibr">[5]</ref> to address timing and carrier phase synchronization for distributed downlink MU-MIMO transmission. One may wonder whether the schemes proposed in <ref type="bibr">[4]</ref> and <ref type="bibr">[5]</ref> can be used to enable UD-MIMO as well. They cannot, because doing so would require hardware modification of 802.11 client devices. In contrast, UD-MIMO not only maintains compatibility with 802.11 devices but also reduces the overhead (see Fig. <ref type="figure">2</ref> in this paper and Fig. <ref type="figure">3</ref> in <ref type="bibr">[4]</ref>).</p><p>In <ref type="bibr">[6]</ref>, a layering protocol called Chorus was proposed to achieve network-wide clock and time synchronization for LTE systems. However, Chorus relies on extra radio resource blocks and new hardware to update frequency shift and phase errors. It is therefore considered an expensive solution. UD-MIMO takes a completely different approach.</p><p>Synchronization in Wireless Networks: A large body of work (see, e.g., <ref type="bibr">[10,</ref><ref type="bibr">11,</ref><ref type="bibr">12,</ref><ref type="bibr">13,</ref><ref type="bibr">14,</ref><ref type="bibr">15]</ref>) has studied time and frequency synchronization in wireless networks. For example, <ref type="bibr">[10]</ref> proposed a distributed architecture called SourceSync to exploit the diversity of transmitters. 
In particular, a specific protocol was proposed to meet the requirements of time synchronization on the transmitter side. Since SourceSync was devised specifically for exploiting diversity, it does not apply to distributed MIMO for spatial multiplexing. <ref type="bibr">[11]</ref> analyzed time and frequency synchronization in large, dense wireless networks. However, these results cannot be directly applied to distributed MIMO systems, either because they are limited to theoretical analysis or because they entail an overwhelmingly large amount of overhead.</p><p>Performance of Distributed MIMO: <ref type="bibr">[16]</ref> presented Signpost, a scalable MU-MIMO scheme without CSI feedback. <ref type="bibr">[7]</ref> presented NEMOx, a hierarchical network architecture to achieve the scalability of distributed MIMO. <ref type="bibr">[17]</ref> studied the performance of different precoding techniques in downlink distributed MIMO systems. However, these efforts focused on the practical realization of distributed MIMO and did not take the synchronization issues into account. Our work is orthogonal to this research line and complements these efforts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. UD-MIMO: AN UPLINK DISTRIBUTED MIMO SCHEME</head><p>We consider a dense WLAN as shown in Fig. <ref type="figure">1</ref>, which comprises M single-antenna APs, N single-antenna STAs, and an AP processor. Such a network could be a Wi-Fi network deployed in conference rooms, shopping malls, or airports. For this network, we make the following assumptions: (i) The APs are connected via a high-speed wired connection, which is suitable for exchanging data packets but not for clock synchronization, as is typically the case in practice. (ii) The STAs in the network could be incumbent 802.11a/g/n/ac user devices. While their software (firmware and driver) can be upgraded, their hardware (e.g., PLL circuit and baseband signal processing at the PHY layer) cannot be upgraded.</p><p>The objective of our design is to enable concurrent uplink transmissions in such a WLAN while preserving its compatibility with incumbent 802.11 client devices (STAs). As the performance of conventional WLANs is limited by co-channel interference, the success of UD-MIMO will significantly improve the network uplink throughput. For ease of exposition, we consider a network where each device has a single antenna. In the end, we shall see that UD-MIMO can also apply to networks where devices have multiple antennas.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Our Approach</head><p>Given that the difference between UD-MIMO and point-to-point MIMO lies in the synchronization, an intuitive approach to enable UD-MIMO is to design a sophisticated mechanism that achieves the necessary synchronization on both the AP and STA sides. This approach, however, cannot maintain backward compatibility with existing 802.11 devices (STAs), because synchronizing the STAs requires the modification of their hardware. The estimation and compensation of carrier frequency offsets can only be done through baseband signal processing modules, which are hard-coded in ASIC chips and cannot be modified by upgrading the firmware or driver. Hence, synchronization operations cannot be conducted by existing 802.11 devices, and such an approach cannot maintain the backward compatibility of the network infrastructure. We propose a new approach for UD-MIMO. In our approach, the APs take full responsibility for addressing the synchronization issues, and the STAs do not need to perform any synchronization operations. [...] to estimate their carrier frequency offsets: Based on the received trigger frame from the lead AP, each slave AP estimates the carrier frequency offset between itself and the lead AP. The estimated carrier frequency offset is recorded at the slave AP and will be used in Step 2. &#8226; Step 2: Uplink data transmission. Upon reception of the trigger frame, the STAs prepare their data packets and send radio signals into the air simultaneously. More specifically, each STA uses the aggregate frame format in Fig. <ref type="figure">3</ref> for the uplink data transmission. Note that, since the STAs operate independently, their transmissions will not start at exactly the same time. A time misalignment may exist, as illustrated in Fig. <ref type="figure">2</ref>. On the AP side, each AP receives mixed radio signals from the STAs. 
Each slave AP compensates for the carrier frequency offset between itself and the lead AP using the frequency offset value estimated in Step 1. Then, all the APs send their signal streams to the AP processor. &#8226; Step 3: Acknowledgment (ACK). Based on the decoding results, the lead AP broadcasts an ACK/NACK packet to the STAs. The ACK/NACK packet indicates which packets from which STAs were not successfully decoded. Based on this information, each STA prepares a retransmission, if necessary, in the next round.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. UD-MIMO Protocol</head><p>A practical consideration is whether the trigger frame from the lead AP is sufficient for the slave APs to keep their carrier frequencies synchronized throughout the period of uplink data transmission. To address this concern, we study the stability of frequency synchronization among the lead and slave APs. Fig. <ref type="figure">4</ref> shows the measured carrier frequency offsets at seven slave APs and the residual carrier frequency offsets after compensation. The residual carrier frequency offsets remain below 180 Hz over 4 ms. This accuracy is sufficient for concurrent data transmission.</p><p>The proposed protocol is simple and has low airtime overhead. But a big question is yet to be answered: how can the AP processor decode the data packets from the STAs? We focus on this question in the next section. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. PACKET DETECTION</head><p>At the AP processor, decoding the data packets faces the following two challenges. First, since the STAs are driven by independent clocks, their carrier frequencies are not exactly the same. Consequently, from the APs' perspective, the received signals from different STAs have different carrier frequency offsets, which must be compensated for before signal detection. However, it remains unknown how to handle such heterogeneous carrier frequency offsets at a MIMO receiver. Second, since the STAs operate independently, their uplink data packets are unlikely to be aligned in the time domain. Moreover, for 802.11 devices, the misalignment of data packets is rarely confined to the duration of the OFDM CP (800 ns). Such a time misalignment makes it hard for the AP processor to decode the data packets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Overview</head><p>To address the STA-side asynchrony issue, we propose a new packet detection method for the AP processor. This new detection method lives with the STA-side asynchrony and tackles the asynchrony issue through baseband signal processing on the AP side (i.e., at the AP processor). Fig. <ref type="figure">5</ref> shows the schematic diagram. The AP processor continuously receives the signal streams from the APs. From the signal streams, it extracts N signal frames, each of which corresponds to a packet from one STA. Then, it decodes each of the N signal frames separately, as illustrated in Fig. <ref type="figure">5</ref>.</p><p>Consider one of the N signal frames, for example. Suppose that it corresponds to the data packet from STA i. We note that this signal frame includes not only the desired signal from STA i but also the undesired signals (interference) from other STAs. To decode this signal frame, the AP processor first performs carrier frequency correction. This module will estimate and compensate the carrier frequency offset between STA i and the APs (assuming the APs have been perfectly synchronized in Step 1 of our protocol). Then, the AP processor converts the signal to the frequency domain for signal detection. When performing signal detection, the AP processor treats the signals from other STAs (all the STAs except STA i) as unknown interference and constructs spatial filters to cancel the interference and equalize the channel distortion.   </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Frame Extraction</head><p>To extract signal frames from the signal streams, the AP processor employs cross-correlation. Specifically, the AP processor correlates each signal stream with a local copy of L-LTF. If the normalized correlation value is greater than a predefined correlation threshold (e.g., 0.4), then it is regarded as the start of a signal frame. Fig. <ref type="figure">6</ref> illustrates the extraction procedure when the network has one STA, and Fig. <ref type="figure">7</ref> illustrates the correlation peaks when the network has three STAs. When there are multiple correlation peaks, their corresponding signal frames are matched in order at each AP (see Fig. <ref type="figure">7</ref> as an example).</p><p>It is worth pointing out that the cross-correlation is conducted in the presence of interference and noise. The correlation value is therefore dependent on the strength of interference and the noise power. Hence, the correlation threshold should be meticulously chosen. A small threshold may lead to a false positive, and a large threshold may lead to a false negative. In our experiments, we set the threshold to 0.4 and find that it works well for the following two reasons: (i) Cross-correlation itself is resilient to interference. L-LTF has 160 samples and appears to be robust against interference. (ii) False negatives are acceptable for packet detection. A small correlation value (below the threshold) indicates that the desired signal is weak and the interference is strong. The exclusion of this stream will actually improve the performance of packet detection, provided that the spatial DoF is sufficient for packet detection.</p></div>
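<div xmlns="http://www.tei-c.org/ns/1.0"><p>The threshold test described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' C++ implementation: the function name detect_frame_starts is hypothetical, a random unit-modulus sequence of 160 samples stands in for the real L-LTF, and the rule that keeps only the largest metric value within one reference length is an assumption added to avoid reporting the same peak twice.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np


def detect_frame_starts(stream, reference, threshold=0.4):
    """Return (starts, metric): candidate frame-start indices and the
    normalized cross-correlation of `stream` against `reference`."""
    stream = np.asarray(stream, dtype=complex)
    reference = np.asarray(reference, dtype=complex)
    n_ref = len(reference)

    # np.correlate conjugates its second argument:
    # corr[n] = sum_m stream[n + m] * conj(reference[m])
    corr = np.abs(np.correlate(stream, reference, mode="valid"))

    # Energy of every length-n_ref window of the stream (via a running sum).
    csum = np.concatenate(([0.0], np.cumsum(np.abs(stream) ** 2)))
    window_energy = csum[n_ref:] - csum[:-n_ref]
    ref_energy = np.sum(np.abs(reference) ** 2)

    # By the Cauchy-Schwarz inequality the metric lies in [0, 1].
    norm = np.sqrt(window_energy * ref_energy)
    metric = corr / np.maximum(norm, 1e-12)

    # Keep a sample only if it exceeds the threshold and is the largest
    # metric value within one reference length on either side.
    starts = []
    for n in np.flatnonzero(metric > threshold):
        lo, hi = max(0, n - n_ref), min(len(metric), n + n_ref + 1)
        if metric[n] >= metric[lo:hi].max():
            starts.append(int(n))
    return starts, metric


# Toy example: a 160-sample stand-in reference buried in noise at sample 300.
rng = np.random.default_rng(0)
reference = np.exp(2j * np.pi * rng.random(160))
stream = 0.1 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
stream[300:460] += reference
starts, metric = detect_frame_starts(stream, reference)
print(starts)
```
]]></code></p>
<p>In a multi-STA setting the same routine would be run on each AP's stream, and the resulting peaks matched in order across APs as the text describes. The normalization by window energy is one design choice among several; it is what makes a fixed threshold such as 0.4 meaningful across different received power levels.</p></div>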
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Carrier Frequency Correction</head><p>Referring to Fig. <ref type="figure">5</ref>, let us consider one of the N signal frames. Suppose it corresponds to the data packet from STA i. To decode this signal frame, the AP processor first performs carrier frequency correction. In the presence of inter-user (inter-STA) interference, conventional methods do not work because they are susceptible to interference. To tackle this issue, we propose a new method, which comprises two steps: [...], where arg(&#183;) is the angle of a complex number, (&#183;)* is the complex conjugate operator, C is the set of samples in the CP of all OFDM symbols, and 64 is the distance between a CP sample and its original copy in the OFDM symbol.</p><p>After obtaining &#952;_i, we then compensate for the carrier frequency offset by letting &#563;(n) = y(n)&#183;e^{jn&#952;_i}, 1 &#8804; n &#8804; N_s. The resultant signal frame &#563;(n) is then sent to the FFT module, as shown in Fig. <ref type="figure">5</ref>.</p></div>
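<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the quantities defined above concrete, the following sketch estimates a per-sample phase rotation from the correlation between each CP sample and its copy 64 samples later, and then applies the compensation &#563;(n) = y(n)&#183;e^{jn&#952;_i}. It is only the plain CP-correlation estimator that these definitions suggest, not the two-step interference-resilient method proposed in this paper. The function names, the 16-sample CP, the frame starting at sample 0, and the sign convention (the offset appears as a factor e^{-jn&#952;} on the received samples) are assumptions made for the example.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np

FFT_LEN = 64  # distance between a CP sample and the sample it copies
CP_LEN = 16   # assumed: 800 ns guard interval at 20 MHz sampling
SYM_LEN = CP_LEN + FFT_LEN


def estimate_cfo(y, n_symbols):
    """Estimate the phase rotation per sample from CP repetitions."""
    acc = 0j
    for sym in range(n_symbols):
        start = sym * SYM_LEN
        cp = y[start:start + CP_LEN]
        copy = y[start + FFT_LEN:start + FFT_LEN + CP_LEN]
        # With y(n) = s(n) * exp(-1j * theta * n) and s(n) = s(n + 64) on
        # the CP, each product equals |s(n)|^2 * exp(1j * 64 * theta).
        acc += np.sum(cp * np.conj(copy))
    return np.angle(acc) / FFT_LEN


def compensate_cfo(y, theta):
    """Apply y(n) * exp(1j * n * theta). Indexing from 0 instead of 1 only
    changes a constant phase, which the channel estimate absorbs."""
    n = np.arange(len(y))
    return y * np.exp(1j * n * theta)


def make_ofdm_frame(n_symbols, rng):
    """Random QPSK OFDM symbols, each with a cyclic prefix prepended."""
    symbols = []
    for _ in range(n_symbols):
        bits_i = rng.choice([-1, 1], FFT_LEN)
        bits_q = rng.choice([-1, 1], FFT_LEN)
        body = np.fft.ifft((bits_i + 1j * bits_q) / np.sqrt(2))
        symbols.append(np.concatenate((body[-CP_LEN:], body)))
    return np.concatenate(symbols)


rng = np.random.default_rng(0)
clean = make_ofdm_frame(10, rng)
theta_true = 0.01  # radians per sample; must satisfy |theta| < pi / 64
received = clean * np.exp(-1j * theta_true * np.arange(len(clean)))
theta_hat = estimate_cfo(received, 10)
restored = compensate_cfo(received, theta_hat)
print(theta_hat)
```
]]></code></p>
<p>The estimator is unambiguous only while 64&#183;|&#952;| stays below &#960;. In this noise-free, interference-free toy case the estimate matches the true offset; with concurrent transmissions from other STAs the plain estimator degrades, which is the motivation the text gives for a dedicated method.</p></div>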
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Interference Cancellation and Signal Recovery</head><p>Problem Formulation: After correcting the carrier frequency offset, the FFT module in Fig. <ref type="figure">5</ref> converts the signal frame from the time domain to the frequency domain. We let Y_j(l, k) denote the output signals from the FFT module, where j &#8712; {1, 2, ..., M} is the index of received signal streams (APs), l &#8712; {1, 2, ..., L} is the index of OFDM symbols, and k &#8712; {1, 2, ..., K} is the index of OFDM subcarriers. Assume that the signals in a frame experience block channel fading. Then, the signal transfer function can be written as:</p><p><formula notation="tex">Y_j(l, k) = H_{ji}(k) X_i(l, k) + \sum_{i' \in N \setminus \{i\}} H_{ji'}(k) \tilde{X}_{i'}(l, k) \qquad (2)</formula> where H_{ji}(k) is the channel between AP j and STA i, X_i(l, k) is the original signal from STA i, X&#771;_{i'}(l, k) is the interfering signal from STA i', and N is the set of STAs.</p><p>For the transfer function in (2), we have the following three remarks. First, this transfer function requires carrier frequency synchronization between STA i and the APs, but it does not require phase synchronization between STAs and APs. The phase offset between STA i and AP j is considered part of H_{ji}(k). Second, X_i(l, k) in (<ref type="formula">2</ref>) is the original signal transmitted by STA i, but X&#771;_{i'}(l, k) is not the original signal transmitted by STA i' &#8712; N \ {i}. This is because X&#771;_{i'}(l, k) is completely distorted by the carrier frequency offset and time misalignment between STA i' and the APs. It is considered unknown interference in this transfer function. Third, since X&#771;_{i'}(l, k) is an unknown interfering signal, it is hard to estimate the channel H_{ji}(k). This is the core challenge in signal recovery.</p><p>Our Detection Method: Based on (2), if we know all the channels, then we can construct a spatial filter G(k). Such a spatial filter can cancel the interference and recover the desired signal. 
This method is the well-known zero-forcing MIMO detector. By taking into account the effect of noise, the zero-forcing detector can be extended to an MMSE detector. Now the question is how to construct the spatial filter G(k) in the absence of channel knowledge. To address this question, we propose a training-based method. Consider the signal frame transmission from STA i to the APs. As illustrated in Fig. <ref type="figure">8</ref>(a), we assume that (i) STA i's preamble experiences interference from unknown signals of other STAs; and (ii) STA i's preamble is independent of its interference. Then, we focus on the received signal frame at the AP processor, which is illustrated in Fig. <ref type="figure">8(b)</ref>. The signal frame is composed of two parts: interfered preamble and interfered data. Since the preamble is known at the AP processor a priori, we use the interfered preamble in Fig. <ref type="figure">8(b)</ref> as the training sequence to construct the spatial filter for data detection. Specifically, we construct the spatial filter as follows:</p><p>where Q(k) is a set of reference symbols in the preamble. Q(k) can be empirically set. In our experiments, we let Q(k) = {(l, k') : 1 &#8804; l &#8804; 4, k - 1 &#8804; k' &#8804; k + 1}, as shown in Fig. <ref type="figure">9</ref>.</p><p>After constructing the spatial filter, we then use it to estimate the original signal by:</p><p>where X&#770;_i(l, k) is the estimated signal from STA i and G_j(k) is the jth entry in the vector (filter) G(k).</p><p>Discussions: We have the following two remarks on the proposed packet detection method. First, the proposed method does not require channel knowledge to decode the packet, as evidenced by (<ref type="formula">3</ref>) and (<ref type="formula">4</ref>). Instead, it uses the interfered preamble as the training sequence to construct a filter, which is then used to cancel the interference and equalize the channel distortion for signal recovery. Second, the proposed detection method is a heuristic. 
We therefore resort to experiments to evaluate its performance. As we will see in Section VI-A, this detection method performs surprisingly well: with it, the performance of UD-MIMO is close to that of MU-MIMO in all tested scenarios.</p></div>
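<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of training a spatial filter on an interfered preamble, without estimating any channel, can be illustrated with a least-squares fit. The sketch below is an assumption-laden stand-in rather than the paper's expressions (3) and (4): it chooses the filter g that minimizes the squared error between g^H Y and the known reference symbols over 12 training samples (mirroring 4 OFDM symbols on 3 adjacent subcarriers, with the channel assumed identical across them), and then applies g^H to the data. All names, the 4-AP/3-STA toy setup, the QPSK interferers, and the noise level are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np


def train_spatial_filter(y_train, x_train):
    """Least-squares filter g with g^H y_train[:, t] ~ x_train[t].

    y_train: (M, T) received reference samples, one row per AP.
    x_train: (T,) known reference symbols of the desired STA.
    """
    # Solve y_train^H g ~ conj(x_train), which is the conjugate of
    # the condition g^H y_train ~ x_train.
    g, *_ = np.linalg.lstsq(y_train.conj().T, x_train.conj(), rcond=None)
    return g


def apply_spatial_filter(g, y):
    """Estimate the desired symbols as g^H y for y of shape (M, T)."""
    return g.conj() @ y


def qpsk(rng, size):
    return (rng.choice([-1, 1], size) + 1j * rng.choice([-1, 1], size)) / np.sqrt(2)


rng = np.random.default_rng(1)
M, N_STA, T_TRAIN, T_DATA = 4, 3, 12, 200

# Unknown channel between the 3 STAs and the 4 APs (never given to the filter).
H = (rng.standard_normal((M, N_STA)) + 1j * rng.standard_normal((M, N_STA))) / np.sqrt(2)


def receive(symbols):
    """Mix all STAs through H and add a small amount of receiver noise."""
    shape = (M, symbols.shape[1])
    noise = 0.01 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    return H @ symbols + noise


# Row 0 is the desired STA; rows 1 and 2 are interferers whose symbols
# are unknown to the receiver.
x_train = qpsk(rng, (N_STA, T_TRAIN))
x_data = qpsk(rng, (N_STA, T_DATA))

g = train_spatial_filter(receive(x_train), x_train[0])
x_hat = apply_spatial_filter(g, receive(x_data))
print(np.mean(np.abs(x_hat - x_data[0])))
```
]]></code></p>
<p>The fit only works when the desired reference symbols are not a linear combination of the interference seen during training, which corresponds to the independence assumption (ii) stated in the text. With more APs than concurrently active STAs, the filter has spare degrees of freedom to suppress the interferers.</p></div>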
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. COMPATIBILITY WITH 802.11 CLIENT DEVICES</head><p>In this section, we first point out the practical issues that arise when UD-MIMO works with incumbent off-the-shelf Wi-Fi client devices and then propose a solution to these issues.</p><p>Practical Issues: UD-MIMO relies heavily on the new packet detection method to tame the asynchrony among the STAs. However, the packet detection method was proposed under the following two assumptions: (i) Referring to Fig. <ref type="figure">8(a)</ref>, STA i's preamble experiences interference from the signals of other STAs. (ii) Referring to Fig. <ref type="figure">8</ref>(a) again, STA i's preamble is linearly independent of the interfering signals from other STAs. To see why these two assumptions are necessary, let us consider the examples in Fig. <ref type="figure">10</ref>.</p><p>In Fig. <ref type="figure">10(a)</ref>, the transmission misalignment of the two STAs is greater than the duration of the preamble. In this case, STA 1's first frame cannot be decoded at the AP processor, because its preamble experiences no interference from STA 2. As a result, the spatial filter constructed from this preamble cannot cancel the interference from STA 2. STA 1's second frame can be decoded at the AP processor because its preamble experiences interference from STA 2. In contrast, both of STA 2's frames can be decoded at the AP processor because their preambles experience interference from STA 1.</p><p>In Fig. <ref type="figure">10</ref>(b), the transmission misalignment of the two STAs is less than the duration of the CP (guard interval). In this case, none of the four frames (two for STA 1 and two for STA 2) can be decoded at the AP processor. This is because the preambles of the two STAs overlap almost entirely and interfere with each other, which violates assumption (ii). From the AP processor's perspective, there is no way to differentiate the signal from STA 1 and that from STA 2. 
The spatial filter constructed from the interfered preamble can neither cancel the interference nor equalize the channel distortion. Therefore, none of the four frames can be decoded.</p><p>Such cases are likely to occur in practice. In Wi-Fi networks, the preamble is 16 microseconds, and the CP is 0.8 microseconds for the long guard interval option and 0.4 microseconds for the short guard interval option. Suppose that the synchronization error achieved by the timing synchronization function (TSF) specified in IEEE 802.11 is uniformly distributed in [0, 20] microseconds <ref type="bibr">[18]</ref>. Then, the probability of case 1 (misalignment greater than the preamble duration) is about 20%, and the probability of case 2 (misalignment less than the CP duration) is 4% (or 2% for the short guard interval option). Collectively, the probability of these two cases is 24%. Therefore, properly handling these two cases is essential for the practical application of UD-MIMO.</p><p>Our Solution: Our solution for fulfilling those two assumptions in the presence of the STAs' transmission misalignment is simple. We insert a dummy packet at the beginning of the uplink transmission at each STA, as illustrated in Fig. <ref type="figure">11</ref>. By properly setting the length of the dummy packet, the preamble of each data packet will meet those two requirements. To determine the length of the dummy packet at each STA, let us again assume that the transmission misalignment is within 20 microseconds <ref type="bibr">[18]</ref>. Then, we set the length of the dummy packet at STA i to 7i OFDM symbols (1 &#8804; i &#8804; N), including both preamble and data.</p><p>For the proposed solution, three remarks are in order. Remark 1: To fulfill those two assumptions, the preambles from different STAs can partially overlap with each other. For example, the L-STF from one STA can overlap with the L-STF from another STA. In such a case, the AP processor can still decode the packets. 
Taking this fact into consideration may help reduce the length of the dummy packets at the STAs.</p><p>Remark 2: The proposed solution entails additional airtime overhead to enable UD-MIMO transmission, and the overhead increases slightly with the number of STAs. This issue can be alleviated by aggregating more packets (signal frames) in the uplink transmission. Since the channel coherence time is long enough in WLANs, an aggregate frame can accommodate hundreds of OFDM symbols. The amortized overhead is then acceptable in practice.</p><p>Remark 3: Since the dummy packet is a normal packet, no hardware modification is needed to insert the dummy packet for a commercial Wi-Fi client device. Rather, it can be implemented by modifying the Wi-Fi device's driver. The length of the dummy packet can be specified in the trigger frame by the lead AP. On the AP side, the AP processor will automatically drop the dummy packet, either because it cannot be decoded or because it does not carry the necessary MAC information. In either case, the dummy packet will not affect the upper-layer applications.</p></div>
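<div xmlns="http://www.tei-c.org/ns/1.0"><p>The probabilities quoted in this section and the resulting dummy-packet durations follow from simple arithmetic, reproduced below as a small sketch. The function names are hypothetical, and the 4-microsecond OFDM symbol duration (3.2 microseconds plus the 0.8-microsecond guard interval) is the standard 20 MHz 802.11 value with the long guard interval, assumed here because it matches the configuration used in the experiments.</p>
<p><code lang="python"><![CDATA[
```python
PREAMBLE_US = 16.0      # preamble duration stated in the text
MAX_MISALIGN_US = 20.0  # misalignment assumed uniform on [0, 20] us
OFDM_SYMBOL_US = 4.0    # assumed: 3.2 us symbol body + 0.8 us guard interval


def case_probabilities(cp_us):
    """Probabilities of the two failure cases for a given CP length (us).

    Case 1: misalignment exceeds the preamble duration.
    Case 2: misalignment is smaller than the CP.
    """
    p_case1 = (MAX_MISALIGN_US - PREAMBLE_US) / MAX_MISALIGN_US
    p_case2 = cp_us / MAX_MISALIGN_US
    return p_case1, p_case2


def dummy_packet_us(sta_index):
    """Duration of the dummy packet at STA i when its length is 7i symbols."""
    return 7 * sta_index * OFDM_SYMBOL_US


p1, p2_long = case_probabilities(0.8)
_, p2_short = case_probabilities(0.4)
print(f"case 1: {p1:.0%}, case 2: {p2_long:.0%} (long GI), {p2_short:.0%} (short GI)")
print([dummy_packet_us(i) for i in (1, 2, 3)])
```
]]></code></p>
<p>Under these assumptions, consecutive STAs are staggered by 7 symbols, i.e., 28 microseconds, which exceeds the assumed 20-microsecond misalignment bound. This appears to be what keeps the data-packet preambles of different STAs from aligning within one CP.</p></div>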
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. EXPERIMENTAL EVALUATION</head><p>In this section, we conduct experiments to evaluate the performance of UD-MIMO on two wireless testbeds.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Implementation and Experimental Setup</head><p>We have built two testbeds to evaluate the performance of UD-MIMO in real wireless environments.</p><p>802.11 Wi-Fi Dongle Testbed: The purpose of this testbed is to validate the practicality of UD-MIMO as well as its compatibility with commercial off-the-shelf Wi-Fi devices.</p><p>For the STAs, we use Wi-Fi dongles (Alfa AWUS036NHA Wireless USB Adapters), which are built on the Qualcomm Atheros AR9271 chipset <ref type="bibr">[19]</ref> and support IEEE 802.11b/g/n. We modify their firmware (modwifi-ath9k-htc in <ref type="bibr">[20]</ref>) to disable carrier sense, RTS/CTS, and ACK, to set SIFS/AIFS to zero, and to insert a dummy packet for UD-MIMO. For simplicity, we fix the MCS index to 2, which corresponds to QPSK modulation, 3/4 coding rate, and 18 Mbps data rate. While we use this specific modulation and coding scheme (MCS), UD-MIMO works with other MCSs as well. We set the channel bandwidth to 20 MHz and the guard interval (OFDM CP) to 800 ns. The transmit power is fixed to 17 dBm, and the carrier frequency is set to 2.427 GHz (channel 4).</p><p>We implement the APs using a set of USRP N210 devices <ref type="bibr">[21]</ref>. Each USRP N210 device is connected to a D-Link SmartPro switch via a 1 Gbps Ethernet RJ45 cord, and the switch is connected to a computer via a 10 Gbps SFP+ DAC cable. A software suite is developed in C++ and deployed on the computer to implement the protocol (see Fig. <ref type="figure">2</ref>) and process the baseband signals. The output of our software suite is the estimated signals from the STAs. Post-processing modules (e.g., deinterleaving, channel decoding, descrambling, and decryption) are not implemented.</p><p>USRP Testbed: The purpose of this testbed is to quantify the performance gap between UD-MIMO and MU-MIMO in the same scenarios. In this testbed, both APs and STAs are custom-built using USRP N210 devices. 
As such, we have full control over both APs and STAs. To measure the performance of MU-MIMO, we synchronize both APs and STAs using external clocks (Ettus' Octoclock CDA-2990G <ref type="bibr">[22]</ref>). For ease of experimentation, we set the sampling rate to 5 Msps. Other parameters are the same as in the 802.11 testbed.</p><p>Experimental Setup: Fig. <ref type="figure">12</ref> shows our experimental setup in a large conference room, where 8 APs and N STAs are deployed. The 8 APs are placed at the marked locations in the figure.</p><p>Fig. <ref type="figure">12</ref>: A conference room for UD-MIMO evaluation.</p><p>Performance Metrics: We consider two metrics. The first one is error vector magnitude (EVM), which is defined by $\mathrm{EVM\,(dB)} = 10 \log_{10}\left(\frac{\mathbb{E}\left[|\hat{X} - X|^2\right]}{\mathbb{E}\left[|X|^2\right]}\right)$, where $X$ is the original signal at the STA and $\hat{X}$ is the estimated signal at the AP. The second one is data rate, which is calculated by $r = \frac{48}{80} \times b \times \gamma(\mathrm{EVM})$ Mbps, where 48 is the number of subcarriers used for payload in an OFDM symbol, 80 is the length of one OFDM symbol in samples (including CP), $b$ is the signal sampling rate (in Msps), and $\gamma(\mathrm{EVM})$ is the average number of bits carried by one symbol, whose values are given in Table <ref type="table">I</ref>. In our experiments, we use b = 20 for the 802.11 testbed and b = 5 for the USRP testbed.</p></div>
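<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two metrics above can be sketched in a few lines. The sketch assumes the common definition of EVM as the ratio of error power to reference signal power, and it takes the bits-per-symbol value as an input because the mapping from EVM to that value (Table I) is not reproduced here. The function names, the time_share parameter, and the sample values are hypothetical.</p>
<ab><![CDATA[
```python
import math


def evm_db(original, estimated):
    """EVM in dB: error power over reference power (assumed definition).

    original  -- transmitted constellation points (complex numbers)
    estimated -- points recovered at the AP, same length and order
    """
    if len(original) != len(estimated) or not original:
        raise ValueError("need two non-empty sequences of equal length")
    error_power = sum(abs(e - x) ** 2 for x, e in zip(original, estimated))
    signal_power = sum(abs(x) ** 2 for x in original)
    return 10.0 * math.log10(error_power / signal_power)


def data_rate_mbps(b_msps, bits_per_symbol, time_share=1.0):
    """r = (48 / 80) * b * gamma, optionally scaled by a time share.

    bits_per_symbol stands in for gamma(EVM); it must be looked up from
    the paper's Table I, which this sketch does not encode.
    time_share models slot sharing, e.g. 0.25 for one of four time slots.
    """
    return 48 * b_msps * bits_per_symbol * time_share / 80


if __name__ == "__main__":
    sent = [1 + 0j, -1 + 0j, 0 + 1j, 0 - 1j]
    received = [1.1 + 0j, -0.9 + 0j, 0 + 1.1j, 0 - 0.9j]
    print(f"EVM: {evm_db(sent, received):.1f} dB")
    # Hypothetical gamma = 2 at b = 20 Msps, one of four time slots.
    print(f"rate: {data_rate_mbps(20, 2, time_share=0.25)} Mbps")
```
]]></ab>
<p>Passing the bits-per-symbol value explicitly keeps the sketch independent of any particular EVM threshold table.</p></div>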
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. 802.11 Wi-Fi Dongle Testbed</head><p>On this testbed, we measure the uplink data rate per STA in two schemes: CSMA (interference avoidance) and UD-MIMO. CSMA: In conventional Wi-Fi networks, only one Wi-Fi dongle is allowed to communicate with one AP in a time slot. As such, we consider four dongles in four different time slots. In the i-th time slot, dongle i placed at location d_i sends a data packet to the lead AP, 1 &#8804; i &#8804; 4.</p><p>Since each time slot has only one active dongle, there is no interference in this case. Fig. <ref type="figure">13</ref> plots the demodulated signal at the lead AP in the four time slots. Specifically, the measured EVMs for the four dongles are -18.7 dB, -21.9 dB, -26.5 dB, and -20.4 dB, respectively. We then extrapolate the data rate based on the measured EVM values. Since each dongle uses one-fourth of the time resources for data transmission, the data rate is divided by four in the calculation. Therefore, the calculated data rate is 6.0 Mbps for dongle 1,</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. USRP Testbed</head><p>On the USRP testbed, we measure the performance of UD-MIMO and MU-MIMO in the same scenarios and quantify their performance gap. CSMA: This is an interference-free case. Interference is avoided in the time domain by assigning different STAs to different time slots. The STAs send their packets to the lead AP using a round-robin scheduler. We measure the EVM of the decoded signal for each STA at the lead AP. Fig. <ref type="figure">15</ref>(a) plots the distribution of our measured EVM when the STA is placed at each of the locations (small circles) in Fig. <ref type="figure">12</ref>. Fig. <ref type="figure">15</ref>(b) plots the distribution of our calculated data rate.</p><p>Since the AP serves a single STA in each time slot, the average uplink throughput achieved by CSMA is 9.3 Mbps.</p><p>Fig. <ref type="figure">17</ref>: The data rate achieved by each of the N STAs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>UD-MIMO versus MU-MIMO:</head><p>In UD-MIMO, we let the 8 APs serve N STAs simultaneously in the uplink, where 1 &#8804; N &#8804; 6. In each instance, we place the N STAs at N different locations (the small circles) in Fig. <ref type="figure">12</ref>. At the computer, we measure the EVM of the demodulated uplink signal from each of the N STAs. We then repeat the same measurements for MU-MIMO, for which we synchronize the 8 APs using an external clock (10 MHz reference signal and 1 PPS) and synchronize the N STAs using another external clock. Fig. <ref type="figure">16</ref> plots the distribution of our measured EVM when UD-MIMO and MU-MIMO are used. In UD-MIMO, the average EVM of the demodulated uplink signals is -28.1 dB when the APs serve one STA, -24.5 dB when the APs serve two STAs, -20.4 dB when the APs serve three STAs, -18.3 dB when the APs serve four STAs, -17.2 dB when the APs serve five STAs, and -14.3 dB when the APs serve six STAs. Moreover, as shown in the figure, the EVM gap between UD-MIMO and MU-MIMO is only about 2.0 dB. This indicates that UD-MIMO successfully resolves the synchronization issues in distributed WLANs.</p><p>We extrapolate the measured EVM to each STA's data rate (with b = 5 Msps). Fig. <ref type="figure">17</ref> plots the data rate achieved by each of the N STAs.</p><p>Throughput Comparison: Finally, we compare the total uplink throughput achieved by CSMA, UD-MIMO, and MU-MIMO. For CSMA, since it serves one STA at a time, the total uplink throughput is the average of all STAs' data rates. For UD-MIMO and MU-MIMO, since they serve N STAs simultaneously, the total uplink throughput is the product of N and the average per-STA data rate. Fig. <ref type="figure">18</ref> presents the comparison of the total uplink throughput achieved by the three techniques. It is evident that, in each case, the throughput of UD-MIMO is much higher than that of CSMA and close to that of MU-MIMO. 
Averaged over the six cases, UD-MIMO achieves 3.4&#215; the throughput of CSMA and 82% of the throughput of MU-MIMO. One may notice that the throughput of all three techniques decreases when the number of STAs increases from 5 to 6. This is because the sixth STA brings significant co-channel interference to the existing 5 STAs. The significant increase in co-channel interference can be attributed to the ill-conditioned MIMO channel between the 6 STAs and the 8 APs.</p></div>
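<div xmlns="http://www.tei-c.org/ns/1.0"><p>The aggregation rules used in the throughput comparison can be written out as a short sketch: time-shared CSMA averages the per-STA rates, whereas concurrent schemes multiply the average per-STA rate by N. The function names and the per-STA rates below are hypothetical and are not measurements from the paper.</p>
<ab><![CDATA[
```python
def csma_total_throughput(per_sta_rates_mbps):
    """One STA is served at a time, so the total is the average rate."""
    if not per_sta_rates_mbps:
        raise ValueError("need at least one STA")
    return sum(per_sta_rates_mbps) / len(per_sta_rates_mbps)


def concurrent_total_throughput(per_sta_rates_mbps):
    """N STAs are served at once: N times the average per-STA rate."""
    if not per_sta_rates_mbps:
        raise ValueError("need at least one STA")
    n = len(per_sta_rates_mbps)
    average = sum(per_sta_rates_mbps) / n
    return n * average


def throughput_ratio(scheme_mbps, baseline_mbps):
    """How many times the baseline throughput a scheme achieves."""
    return scheme_mbps / baseline_mbps


if __name__ == "__main__":
    # Hypothetical per-STA rates in Mbps, for illustration only.
    csma_rates = [6.0, 9.0, 12.0]
    ud_mimo_rates = [5.0, 7.0, 9.0]
    csma = csma_total_throughput(csma_rates)
    ud_mimo = concurrent_total_throughput(ud_mimo_rates)
    print(csma, ud_mimo, throughput_ratio(ud_mimo, csma))
```
]]></ab>
<p>The same ratio function can express either comparison reported in the text, depending on whether CSMA or MU-MIMO is used as the baseline.</p></div>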
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VII. CONCLUSION</head><p>In this paper, we presented UD-MIMO, a practical uplink distributed MIMO scheme for WLANs. UD-MIMO enables concurrent data transmissions from multiple STAs to multiple APs. UD-MIMO is compatible with commercial off-the-shelf 802.11 devices (with a modified driver). The enabler behind UD-MIMO is a new signal detection method, which can decode concurrent data packets from asynchronous STAs. We have built a prototype of UD-MIMO on two wireless testbeds and demonstrated its compatibility with Qualcomm Atheros 802.11 devices. Our experimental results show that UD-MIMO offers 3.4&#215; the throughput of the CSMA-based interference-avoidance approach and achieves 82% of the throughput of MU-MIMO.</p></div></body>
		</text>
</TEI>
