In an input-queued switch, a crossbar schedule, or a matching between the input ports and the output ports, needs to be computed for each switching cycle, or time slot. It is a challenging research problem to design switching algorithms that produce high-quality matchings yet have a very low computational complexity when the switch has a large number of ports. Indeed, there appears to be a fundamental tradeoff between the computational complexity of the switching algorithm and the quality of the computed matchings.
Parallel maximal matching algorithms (adapted for switching) appear to be a sweet tradeoff point in this regard. On one hand, they provide the following performance guarantees: Using maximal matchings as crossbar schedules results in at least 50% switch throughput and order-optimal (i.e., independent of the switch size 𝑁) average delay bounds for various traffic arrival processes. On the other hand, their computational complexities can be as low as 𝑂(log_2 𝑁) per port/processor, which is much lower than those of the algorithms for finding matchings of higher qualities such as maximum weighted matching.
In this work, we propose QPS-r, a parallel iterative switching algorithm that has the lowest possible computational complexity: 𝑂(1) per port. Yet, the matchings that QPS-r computes have the same quality as maximal matchings in the following sense: Using such matchings as crossbar schedules results in exactly the same aforementioned provable throughput and delay guarantees as using maximal matchings, as we show using Lyapunov stability analysis. Although QPS-r builds upon an existing add-on technique called Queue-Proportional Sampling (QPS), we are the first to discover and prove this nice property of such matchings. We also demonstrate that QPS-3 (running 3 iterations) has empirical throughput and delay performance comparable to that of iSLIP (running log_2 𝑁 iterations), a refined and optimized representative maximal matching algorithm adapted for switching.
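The abstract names the technique but does not spell out its steps, so the following is only a minimal sequential sketch of how an r-iteration queue-proportional sampling scheme might look. It rests on assumptions that are not stated in the text above: each unmatched input proposes to one output sampled with probability proportional to its virtual output queue (VOQ) lengths, and each unmatched output accepts the proposal backed by the longest VOQ. The names `qps_r`, `voq`, and `rng` are hypothetical, and the sampling step here costs O(N) per port rather than the O(1) that the abstract claims for the real algorithm.

```python
import random


def qps_r(voq, r, rng=random):
    """Sketch of r rounds of queue-proportional sampling (assumed behavior).

    voq[i][j] is the number of packets queued at input i for output j.
    Returns a dict mapping each matched input port to its output port.
    """
    n = len(voq)
    match = {}           # input -> output
    taken = set()        # outputs that are already matched
    for _ in range(r):
        proposals = {}   # output -> list of (VOQ length, input)
        for i in range(n):
            if i in match or sum(voq[i]) == 0:
                continue  # already matched, or nothing to send
            # Assumption: sample an output proportionally to VOQ lengths.
            j = rng.choices(range(n), weights=voq[i])[0]
            if j not in taken:
                proposals.setdefault(j, []).append((voq[i][j], i))
        for j, candidates in proposals.items():
            # Assumption: accept the longest VOQ; ties go to the larger index.
            _, i = max(candidates)
            match[i] = j
            taken.add(j)
    return match
```

For example, `qps_r([[3, 0], [0, 2]], r=1)` can only pair input 0 with output 0 and input 1 with output 1, because each input has a single non-empty queue to sample from.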
Sliding-Window QPS (SW-QPS): A Perfect Parallel Iterative Switching Algorithm for Input-Queued Switches
In this work, we first propose a parallel batch switching algorithm called Small-Batch Queue-Proportional Sampling (SB-QPS). Compared to other batch switching algorithms, SB-QPS significantly reduces the batch size without sacrificing the throughput performance and hence has much lower delay when traffic load is light to moderate. It also achieves the lowest possible time complexity of O(1) per matching computation per port, via parallelization. We then propose another algorithm called Sliding-Window QPS (SW-QPS). SW-QPS retains and enhances all benefits of SB-QPS, and reduces the batching delay to zero via a novel switching framework called sliding-window switching. In addition, SW-QPS computes matchings of much higher qualities, as measured by the resulting throughput and delay performances, than QPS-1, the state-of-the-art regular switching algorithm that builds upon the same underlying bipartite matching algorithm.
- Award ID(s): 1909048
- NSF-PAR ID: 10296676
- Date Published:
- Journal Name: ACM SIGMETRICS Performance Evaluation Review
- Volume: 48
- Issue: 3
- ISSN: 0163-5999
- Page Range / eLocation ID: 71 to 76
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In an input-queued switch, a crossbar schedule, or a matching between the input ports and the output ports, needs to be computed for each switching cycle, or time slot. It is a challenging research problem to design switching algorithms that produce high-quality matchings yet have a very low computational complexity when the switch has a large number of ports. Indeed, there appears to be a fundamental tradeoff between the computational complexity of the switching algorithm and the quality of the computed matchings. Parallel maximal matching algorithms (adapted for switching) appear to be a sweet tradeoff point in this regard. On one hand, they provide the following performance guarantees: Using maximal matchings as crossbar schedules results in at least 50% switch throughput and order-optimal (i.e., independent of the switch size 𝑁) average delay bounds for various traffic arrival processes. On the other hand, their computational complexities can be as low as 𝑂(log_2 𝑁) per port/processor, which is much lower than those of the algorithms for finding matchings of higher qualities such as maximum weighted matching. In this work, we propose QPS-r, a parallel iterative switching algorithm that has the lowest possible computational complexity: 𝑂(1) per port. Yet, the matchings that QPS-r computes have the same quality as maximal matchings in the following sense: Using such matchings as crossbar schedules results in exactly the same aforementioned provable throughput and delay guarantees as using maximal matchings, as we show using Lyapunov stability analysis. Although QPS-r builds upon an existing add-on technique called Queue-Proportional Sampling (QPS), we are the first to discover and prove this nice property of such matchings. We also demonstrate that QPS-3 (running 3 iterations) has empirical throughput and delay performance comparable to that of iSLIP (running log_2 𝑁 iterations), a refined and optimized representative maximal matching algorithm adapted for switching.
-
Krause, Andreas and (Ed.) General function approximation is a powerful tool to handle large state and action spaces in a broad range of reinforcement learning (RL) scenarios. However, theoretical understanding of non-stationary MDPs with general function approximation is still limited. In this paper, we make the first such attempt. We first propose a new complexity metric called dynamic Bellman Eluder (DBE) dimension for non-stationary MDPs, which subsumes the majority of existing tractable RL problems in static MDPs as well as non-stationary MDPs. Based on the proposed complexity metric, we propose a novel confidence-set based model-free algorithm called SW-OPEA, which features a sliding window mechanism and a new confidence set design for non-stationary MDPs. We then establish an upper bound on the dynamic regret for the proposed algorithm, and show that SW-OPEA is provably efficient as long as the variation budget is not significantly large. We further demonstrate via examples of non-stationary linear and tabular MDPs that our algorithm performs better in the small variation budget scenario than the existing UCB-type algorithms. To the best of our knowledge, this is the first dynamic regret analysis in non-stationary MDPs with general function approximation.
-
We propose distributed scheduling algorithms that guarantee a constant fraction of the maximum throughput for typical wireless topologies, and have O(1) delay and complexity in the network size. Our algorithms resolve collisions among pairs of conflicting nodes by assigning a master-slave hierarchy. When the master-slave hierarchy is chosen randomly, our algorithm matches the throughput performance of the maximal scheduling policies, with a complexity and delay that do not scale with network size. When the master-slave hierarchy is chosen based on the network topology, the throughput performance of our algorithm is characterized by a parameter of the conflict graph called the master-interference degree. For commonly-used conflict-graph topologies, our results lead to the best known throughput guarantees among the algorithms that have O(1) delay and complexity. Numerical results indicate that our algorithms outperform the existing O(1) complexity algorithms like Q-CSMA.
-
We study the multi-player stochastic multiarmed bandit (MAB) problem in an abruptly changing environment. We consider a collision model in which a player receives reward at an arm if it is the only player to select the arm. We design two novel algorithms, namely, Round-Robin Sliding-Window Upper Confidence Bound# (RR-SW-UCB#), and the Sliding-Window Distributed Learning with Prioritization (SW-DLP). We rigorously analyze these algorithms and show that the expected cumulative group regret for these algorithms is upper bounded by sublinear functions of time, i.e., the time average of the regret asymptotically converges to zero. We complement our analytic results with numerical illustrations.