In an inputqueued switch, a crossbar schedule, or a matching between the input ports and the output ports needs to be computed for each switching cycle, or time slot. It is a challenging research problem to design switching algorithms that produce highquality matchings yet have a very low computational complexity when the switch has a large number of ports. Indeed, there appears to be a fundamental tradeoff between the computational complexity of the switching algorithm and the quality of the computed matchings.
Parallel maximal matching algorithms (adapted for switching) appear to be a sweet tradeoff point in this regard. On one hand, they provide the following performance guarantees: Using maxi mal matchings as crossbar schedules results in at least 50% switch throughput and orderoptimal (i.e., independent of the switch size š ) average delay bounds for various traffic arrival processes. On the other hand, their computational complexities can be as low as š (log2 š ) per port/processor, which is much lower than those of the algorithms for finding matchings of higher qualities such as maximum weighted matching.
In this work, we propose QPSr, a parallel iterative switching
algorithm that has the lowest possible computational complexity:
š(1) per port. Yet, the matchings that QPSr computes have the
same quality as maximal matchings in the following sense: Using
such matchings as crossbar schedules results in exactly the same
aforementioned provable throughput and delay guarantees as using
maximal matchings, as we show using Lyapunov stability analysis.
Although QPSr builds upon an existing addon technique called
QueueProportional Sampling (QPS), we are the first to discover
and prove this nice property of such matchings. We also demon
strate that QPS3 (running 3 iterations) has comparable empirical
throughput and delay performances as iSLIP (running log š itera 2
tions), a refined and optimized representative maximal matching algorithm adapted for switching.
more »
« less
QPSr: A CostEffective Iterative Switching Algorithm for InputQueued Switches
In an inputqueued switch, a crossbar schedule, or a matching between the input ports and the output ports needs to be computed for each switching cycle, or time slot. It is a challenging research problem to design switching algorithms that produce highquality matchings yet have a very low computational complexity when the switch has a large number of ports. Indeed, there appears to be a fundamental tradeoff between the computational complexity of the switching algorithm and the quality of the computed matchings.
Parallel maximal matching algorithms (adapted for switching) appear to be a sweet tradeoff point in this regard. On one hand, they provide the following performance guarantees: Using maxi mal matchings as crossbar schedules results in at least 50% switch throughput and orderoptimal (i.e., independent of the switch size š ) average delay bounds for various traffic arrival processes. On the other hand, their computational complexities can be as low as š (log_2 š) per port/processor, which is much lower than those of the algorithms for finding matchings of higher qualities such as maximum weighted matching.
In this work, we propose QPSr, a parallel iterative switching
algorithm that has the lowest possible computational complexity:
š(1) per port. Yet, the matchings that QPSr computes have the
same quality as maximal matchings in the following sense: Using
such matchings as crossbar schedules results in exactly the same
aforementioned provable throughput and delay guarantees as using
maximal matchings, as we show using Lyapunov stability analysis.
Although QPSr builds upon an existing addon technique called
QueueProportional Sampling (QPS), we are the first to discover
and prove this nice property of such matchings. We also demon
strate that QPS3 (running 3 iterations) has comparable empirical
throughput and delay performances as iSLIP (running log š itera 2
tions), a refined and optimized representative maximal matching algorithm adapted for switching.
more »
« less
 NSFPAR ID:
 10167945
 Date Published:
 Journal Name:
 Proc. of Valuetools 2020
 Page Range / eLocation ID:
 19 to 26
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


null (Ed.)In this work, we first propose a parallel batch switching algorithm called SmallBatch QueueProportional Sampling (SBQPS). Compared to other batch switching algorithms, SBQPS significantly reduces the batch size without sacrificing the throughput performance and hence has much lower delay when traffic load is light to moderate. It also achieves the lowest possible time complexity of O(1) per matching computation per port, via parallelization. We then propose another algorithm called SlidingWindow QPS (SWQPS). SWQPS retains and enhances all benefits of SBQPS, and reduces the batching delay to zero via a novel switching framework called slidingwindow switching. In addition, SWQPS computes matchings of much higher qualities, as measured by the resulting throughput and delay performances, than QPS1, the stateoftheart regular switching algorithm that builds upon the same underlying bipartite matching algorithm.more » « less

We propose distributed scheduling algorithms that guarantee a constant fraction of the maximum throughput for typical wireless topologies, and have O(1) delay and complexity in the network size. Our algorithms resolve collisions among pairs of conflicting nodes by assigning a masterslave hierarchy. When the masterslave hierarchy is chosen randomly, our algorithm matches the throughput performance of the maximal scheduling policies, with a complexity and delay that do not scale with network size. When the masterslave hierarchy is chosen based on the network topology, the throughput performance of our algorithm is characterized by a parameter of the conflict graph called the masterinterference degree. For commonlyused conflictgraph topologies, our results lead to the best known throughput guarantees among the algorithms that have O(1) delay and complexity. Numerical results indicate that our algorithms outperform the existing O(1) complexity algorithms like QCSMA.more » « less

This article presents a dualband power amplifier for 28 and 39 GHz frequency bands based on a new dualpath transformer (DPT). This DPT can provide two optimum inductive values at two different frequency bands to optimally design the matching networks for each band without using any switch circuitries. It operates as the output and input matching networks in a parallel power combiner and divider, respectively. DPTbased PA breaks the tradeoff between bandwidth and performance in conventional wideband PAs by separating one whole wideband into two narrow bands providing optimum input and output matchings for each band. The DPTbased PA has two input and two output ports. One set of input and output ports is dedicated to a lower frequency band and the other set of input and outport ports can be used for a higher frequency band. Each output port can drive a separate antenna in a phased array for each frequency band. The proposed PA prototype is fabricated in a 65 nm CMOS process achieving 15.3 and 14.0 dBm of saturated output power in 28 and 39 GHz. The peak efficiency of the PA is 34.1% and 30.2% at 28 and 39 GHz frequency bands. The PA has a measured EVM with 64QAM modulated signal in both frequency bands showing ā25.03 and ā25.10 dB in the low and higher frequency bands, respectively.more » « less

We study the problem of scheduling VMs (Virtual Machines) in a distributed server platform, motivated by cloud computing applications. The VMs arrive dynamically over time to the system, and require a certain amount of resources (e.g. memory, CPU, etc) for the duration of their service. To avoid costly preemptions, we consider nonpreemptive scheduling: Each VM has to be assigned to a server which has enough residual capacity to accommodate it, and once a VM is assigned to a server, its service cannot be disrupted (preempted). Prior approaches to this problem either have high complexity, require synchronization among the servers, or yield queue sizes/delays which are excessively large. We propose a nonpreemptive scheduling algorithm that resolves these issues. In general, given an approximation algorithm to Knapsack with approximation ratio r , our scheduling algorithm can provide rĪ² fraction of the throughput region for Ī² < r. In the special case of a greedy approximation algorithm to Knapsack, we further show that this condition can be relaxed to Ī²<1. The parameters Ī² and r can be tuned to provide a tradeoff between achievable throughput, delay, and computational complexity of the scheduling algorithm. Finally extensive simulation results using both synthetic and real traffic traces are presented to verify the performance of our algorithm.more » « less