skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, October 10 until 2:00 AM ET on Friday, October 11 due to maintenance. We apologize for the inconvenience.


Title: QPS-r: A Cost-Effective Iterative Switching Algorithm for Input-Queued Switches
In an input-queued switch, a crossbar schedule, or a matching between the input ports and the output ports needs to be computed for each switching cycle, or time slot. It is a challenging research problem to design switching algorithms that produce high-quality matchings yet have a very low computational complexity when the switch has a large number of ports. Indeed, there appears to be a fundamental tradeoff between the computational complexity of the switching algorithm and the quality of the computed matchings. Parallel maximal matching algorithms (adapted for switching) appear to be a sweet tradeoff point in this regard. On one hand, they provide the following performance guarantees: Using maxi- mal matchings as crossbar schedules results in at least 50% switch throughput and order-optimal (i.e., independent of the switch size š‘ ) average delay bounds for various traffic arrival processes. On the other hand, their computational complexities can be as low as š‘‚ (log_2 š‘) per port/processor, which is much lower than those of the algorithms for finding matchings of higher qualities such as maximum weighted matching. In this work, we propose QPS-r, a parallel iterative switching algorithm that has the lowest possible computational complexity: š‘‚(1) per port. Yet, the matchings that QPS-r computes have the same quality as maximal matchings in the following sense: Using such matchings as crossbar schedules results in exactly the same aforementioned provable throughput and delay guarantees as using maximal matchings, as we show using Lyapunov stability analysis. Although QPS-r builds upon an existing add-on technique called Queue-Proportional Sampling (QPS), we are the first to discover and prove this nice property of such matchings. We also demon- strate that QPS-3 (running 3 iterations) has comparable empirical throughput and delay performances as iSLIP (running log š‘ itera- 2 tions), a refined and optimized representative maximal matching algorithm adapted for switching.  more » « less
Award ID(s):
1909048 1850439
NSF-PAR ID:
10167945
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proc. of Valuetools 2020
Page Range / eLocation ID:
19 to 26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In an input-queued switch, a crossbar schedule, or a matching between the input ports and the output ports needs to be computed for each switching cycle, or time slot. It is a challenging research problem to design switching algorithms that produce high-quality matchings yet have a very low computational complexity when the switch has a large number of ports. Indeed, there appears to be a fundamental tradeoff between the computational complexity of the switching algorithm and the quality of the computed matchings. Parallel maximal matching algorithms (adapted for switching) appear to be a sweet tradeoff point in this regard. On one hand, they provide the following performance guarantees: Using maxi- mal matchings as crossbar schedules results in at least 50% switch throughput and order-optimal (i.e., independent of the switch size š‘ ) average delay bounds for various traffic arrival processes. On the other hand, their computational complexities can be as low as š‘‚ (log2 š‘ ) per port/processor, which is much lower than those of the algorithms for finding matchings of higher qualities such as maximum weighted matching. In this work, we propose QPS-r, a parallel iterative switching algorithm that has the lowest possible computational complexity: š‘‚(1) per port. Yet, the matchings that QPS-r computes have the same quality as maximal matchings in the following sense: Using such matchings as crossbar schedules results in exactly the same aforementioned provable throughput and delay guarantees as using maximal matchings, as we show using Lyapunov stability analysis. Although QPS-r builds upon an existing add-on technique called Queue-Proportional Sampling (QPS), we are the first to discover and prove this nice property of such matchings. We also demon- strate that QPS-3 (running 3 iterations) has comparable empirical throughput and delay performances as iSLIP (running log š‘ itera- 2 tions), a refined and optimized representative maximal matching algorithm adapted for switching. 
    more » « less
  2. null (Ed.)
    In this work, we first propose a parallel batch switching algorithm called Small-Batch Queue-Proportional Sampling (SB-QPS). Compared to other batch switching algorithms, SB-QPS significantly reduces the batch size without sacrificing the throughput performance and hence has much lower delay when traffic load is light to moderate. It also achieves the lowest possible time complexity of O(1) per matching computation per port, via parallelization. We then propose another algorithm called Sliding-Window QPS (SW-QPS). SW-QPS retains and enhances all benefits of SB-QPS, and reduces the batching delay to zero via a novel switching framework called sliding-window switching. In addition, SW-QPS computes matchings of much higher qualities, as measured by the resulting throughput and delay performances, than QPS-1, the state-of-the-art regular switching algorithm that builds upon the same underlying bipartite matching algorithm. 
    more » « less
  3. We propose distributed scheduling algorithms that guarantee a constant fraction of the maximum throughput for typical wireless topologies, and have O(1) delay and complexity in the network size. Our algorithms resolve collisions among pairs of conflicting nodes by assigning a master-slave hierarchy. When the master-slave hierarchy is chosen randomly, our algorithm matches the throughput performance of the maximal scheduling policies, with a complexity and delay that do not scale with network size. When the master-slave hierarchy is chosen based on the network topology, the throughput performance of our algorithm is characterized by a parameter of the conflict graph called the master-interference degree. For commonly-used conflict-graph topologies, our results lead to the best known throughput guarantees among the algorithms that have O(1) delay and complexity. Numerical results indicate that our algorithms outperform the existing O(1) complexity algorithms like Q-CSMA. 
    more » « less
  4. This article presents a dual-band power amplifier for 28 and 39 GHz frequency bands based on a new dual-path transformer (DPT). This DPT can provide two optimum inductive values at two different frequency bands to optimally design the matching networks for each band without using any switch circuitries. It operates as the output and input matching networks in a parallel power combiner and divider, respectively. DPT-based PA breaks the trade-off between bandwidth and performance in conventional wideband PAs by separating one whole wideband into two narrow bands providing optimum input and output matchings for each band. The DPT-based PA has two input and two output ports. One set of input and output ports is dedicated to a lower frequency band and the other set of input and outport ports can be used for a higher frequency band. Each output port can drive a separate antenna in a phased array for each frequency band. The proposed PA prototype is fabricated in a 65 nm CMOS process achieving 15.3 and 14.0 dBm of saturated output power in 28 and 39 GHz. The peak efficiency of the PA is 34.1% and 30.2% at 28 and 39 GHz frequency bands. The PA has a measured EVM with 64-QAM modulated signal in both frequency bands showing āˆ’25.03 and āˆ’25.10 dB in the low and higher frequency bands, respectively. 
    more » « less
  5. We study the problem of scheduling VMs (Virtual Machines) in a distributed server platform, motivated by cloud computing applications. The VMs arrive dynamically over time to the system, and require a certain amount of resources (e.g. memory, CPU, etc) for the duration of their service. To avoid costly preemptions, we consider non-preemptive scheduling: Each VM has to be assigned to a server which has enough residual capacity to accommodate it, and once a VM is assigned to a server, its service cannot be disrupted (preempted). Prior approaches to this problem either have high complexity, require synchronization among the servers, or yield queue sizes/delays which are excessively large. We propose a non-preemptive scheduling algorithm that resolves these issues. In general, given an approximation algorithm to Knapsack with approximation ratio r , our scheduling algorithm can provide rĪ² fraction of the throughput region for Ī² < r. In the special case of a greedy approximation algorithm to Knapsack, we further show that this condition can be relaxed to Ī²<1. The parameters Ī² and r can be tuned to provide a tradeoff between achievable throughput, delay, and computational complexity of the scheduling algorithm. Finally extensive simulation results using both synthetic and real traffic traces are presented to verify the performance of our algorithm. 
    more » « less