skip to main content

Title: Approximately Reversible Stochastic Processing Networks
The property of (quasi-)reversibility of Markov chains have led to elegant characterization of steady-state distribution for complex queueing networks, e.g. celebrated Jackson networks, BCMP (Baskett, Chandi, Muntz, Palacois) and Kelly theorem. In a nutshell, despite the complicated interaction, in the steady-state, the queues in such networks exhibit independence and subsequently lead to explicit calculations of distributional properties of the queuing network that may seem impossible at the outset. The model of stochastic processing network (cf. Harrison 2000) captures variety of dynamic resource allocation problems including the flow-level networks used for modeling bandwidth sharing in the Internet, switched networks (cf. Shah, Wischik 2006) for modeling packet scheduling in the Internet router and wireless medium access, and hybrid flow-packet networks for modeling job-and-packet level scheduling in data centers. Unlike before, an appropriate resource allocation or scheduling policy is required in such networks to achieve good performance. Given the complexity, asymptotic analytic approaches such as fluid model or Lyapunov-Foster criteria to establish positive-recurrence and heavy traffic or diffusion approximation to characterize the scaled steady-state distribution became method of choice. A remarkable progress has been made along these lines over the past few decades, but there is a need for much more to match more » the explicit calculations in the context of reversible networks. In this work, we will present an alternative to this approach that leads to non-asymptotic, explicit characterization of steady-state distribution akin BCMP / Kelly theorems. This involves (a) identifying a "relaxation" of the given stochastic processing network in terms of an appropriate (quasi-)reversible queueing network, and (b) finding a resource allocation or scheduling policy of interest that "emulates" the "relaxed" networks within "small error". The proof is in the puddling -- we will present three examples of this program: (i) distributed scheduling in wireless network, (ii) scheduling in switched networks, and (iii) flow-packet scheduling in a data center. The notion of "baseline performance" (cf. Harrison, Mandayam, Shah, Yang 2014) will naturally emerges as a consequence of this program. We will discuss open questions pertaining multi-hop networks and computation complexity. « less
Award ID(s):
Publication Date:
Journal Name:
INFORMS Applied Probability Society Conference
Sponsoring Org:
National Science Foundation
More Like this
  1. Core-Stateless Fair Queueing (CSFQ) is a scalable algorithm proposed more than two decades ago to achieve fair queueing without keeping per-flow state in the network. Unfortunately, CSFQ did not take off, in part because it required protocol changes (i.e., adding new fields to the packet header), and hardware support to process packets at line rate. In this paper, we argue that two emerging trends are making CSFQ relevant again: (1) cloud computing which makes it feasible to change the protocol within the same datacenter or across datacenters owned by the same provider, and (2) programmable switches which can implement sophisticated packet processing at line rate. To this end, we present the first realization of CSFQ using programmable switches. In addition, we generalize CSFQ to a multi-level hierarchy, which naturally captures the traffic in today's datacenters, e.g., tenants at the first level and flows of each tenant at the second level of the hierarchy. We call this scheduler Hierarchical Core-Stateless Fair Queueing (HCSFQ), and show that it is able to accurately approximate hierarchical fair queueing. HCSFQ is highly scalable: it uses just a single FIFO queue, does not perform per-packet scheduling, and only needs to maintain state for the interior nodesmore »of the hierarchy. We present analytical results to prove the lower bounds of HCSFQ. Our testbed experiments and large-scale simulations show that CSFQ and HCSFQ can provide fair bandwidth allocation and ensure isolation.« less
  2. The concept of Industry 4.0 introduces the unification of industrial Internet-of-Things (IoT), cyber physical systems, and data-driven business modeling to improve production efficiency of the factories. To ensure high production efficiency, Industry 4.0 requires industrial IoT to be adaptable, scalable, real-time, and reliable. Recent successful industrial wireless standards such as WirelessHART appeared as a feasible approach for such industrial IoT. For reliable and real-time communication in highly unreliable environments, they adopt a high degree of redundancy. While a high degree of redundancy is crucial to real-time control, it causes a huge waste of energy, bandwidth, and time under a centralized approach and are therefore less suitable for scalability and handling network dynamics. To address these challenges, we propose DistributedHART—a distributed real-time scheduling system for WirelessHART networks. The essence of our approach is to adopt local (node-level) scheduling through a time window allocation among the nodes that allows each node to schedule its transmissions using a real-time scheduling policy locally and online. DistributedHART obviates the need of creating and disseminating a central global schedule in our approach, thereby significantly reducing resource usage and enhancing the scalability. To our knowledge, it is the first distributed real-time multi-channel scheduler for WirelessHART. We havemore »implemented DistributedHART and experimented on a 130-node testbed. Our testbed experiments as well as simulations show at least 85% less energy consumption in DistributedHART compared to existing centralized approach while ensuring similar schedulability.« less
  3. We consider a wireless network with a base station serving multiple traffic streams to different destinations. Packets from each stream arrive to the base station according to a stochastic process and are enqueued in a separate (per stream) queue. The queueing discipline controls which packet within each queue is available for transmission. The base station decides, at every time t, which stream to serve to the corresponding destination. The goal of scheduling decisions is to keep the information at the destinations fresh. Information freshness is captured by the Age of Information (AoI) metric. In this paper, we derive a lower bound on the AoI performance achievable by any given network operating under any queueing discipline. Then, we consider three common queueing disciplines and develop both an Optimal Stationary Randomized policy and a Max-Weight policy under each discipline. Our approach allows us to evaluate the combined impact of the stochastic arrivals, queueing discipline and scheduling policy on AoI. We evaluate the AoI performance both analytically and using simulations. Numerical results show that the performance of the Max-Weight policy is close to the analytical lower bound.
  4. Wireless networks are being applied in various industrial sectors, and they are posed to support mission-critical industrial IoT applications which require ultra-reliable, low-latency communications (URLLC). Ensuring predictable per-packet communication reliability is a basis of predictable URLLC, and scheduling and power control are two basic enablers. Scheduling and power control, however, are subject to challenges such as harsh environments, dynamic channels, and distributed network settings in industrial IoT. Existing solutions are mostly based on heuristic algorithms or asymptotic analysis of network performance, and there lack field-deployable algorithms for ensuring predictable per-packet reliability. Towards addressing the gap, we examine the cross-layer design of joint scheduling and power control and analyze the associated challenges. We introduce the Perron–Frobenius theorem to demonstrate that scheduling is a must for ensuring predictable communication reliability, and by investigating characteristics of interference matrices, we show that scheduling with close-by links silent effectively constructs a set of links whose required reliability is feasible with proper transmission power control. Given that scheduling alone is unable to ensure predictable communication reliability while ensuring high throughput and addressing fast-varying channel dynamics, we demonstrate how power control can help improve both the reliability at each time instant and throughput in the long-term. Basedmore »on the analysis, we propose a candidate framework of joint scheduling and power control, and we demonstrate how this framework behaves in guaranteeing per-packet communication reliability in the presence of wireless channel dynamics of different time scales. Collectively, these findings provide insight into the cross-layer design of joint scheduling and power control for ensuring predictable per-packet reliability in the presence of wireless network dynamics and uncertainties.« less
  5. The past decade has witnessed the rapid development of real-time wireless technologies and their wide adoption in various industrial Internet-of-Things (IIoT) applications. Among those wireless technologies, 6TiSCH is a promising candidate as the de facto standard due to its nice feature of gluing a real-time link-layer standard (802.15.4e, for offering deterministic communication performance) together with an IP-enabled upper-layer stack (for seamlessly supporting Internet services). 6TiSCH's built-in random slot selection scheduling algorithm, however, often leads to large and unbounded transmission latency, thus can hardly meet the real-time requirements of IIoT applications. This paper proposes an adaptive partition based scheduling framework, APaS, for 6TiSCH networks. APaS introduces the concept of resource partitioning into 6TiSCH network management. Instead of allocating network resources to individual devices, APaS partitions and assigns network resources to different groups of devices based on their layers in the network so as to guarantee that the transmission latency of any end-toend flow is within one slotframe length. APaS also employs a novel online partition adjustment method to further improve its adaptability to dynamic network topology changes. The effectiveness of APaS is validated through both simulation and testbed experiments on a 122-node multi-hop 6TiSCH network.