We consider an energy harvesting sensor transmitting latency-sensitive data over a fading channel. We aim to find the optimal transmission scheduling policy that minimizes the packet queuing delay given the available harvested energy. We formulate the problem as a Markov decision process (MDP) over a state space spanned by the transmitter's buffer, battery, and channel states, and analyze the structural properties of the resulting optimal value function, which quantifies the long-run performance of the optimal scheduling policy. We show that the optimal value function (i) is non-decreasing and has increasing differences in the queue backlog; (ii) is non-increasing and has increasing differences in the battery state; and (iii) is submodular in the buffer and battery states. Our numerical results confirm these properties and demonstrate that the optimal scheduling policy outperforms a so-called greedy policy in terms of sensor outages, buffer overflows, energy efficiency, and queuing delay.
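For readers unfamiliar with the terminology, the three structural properties can be written out using one common discrete reading of "increasing differences" and "submodular". The notation below is an assumption made for illustration and does not come from the abstract: V(b, e, h) denotes the optimal value function (a delay cost), b the buffer backlog, e the battery level, and h the channel state.

```latex
% Assumed notation (not from the abstract): V(b, e, h) is the optimal value
% function, b the buffer backlog, e the battery level, h the channel state.
\begin{align*}
&\text{(i) non-decreasing in } b: && V(b+1,e,h) \ge V(b,e,h),\\
&\phantom{\text{(i) }}\text{increasing differences in } b: && V(b+2,e,h) - V(b+1,e,h) \ge V(b+1,e,h) - V(b,e,h),\\
&\text{(ii) non-increasing in } e: && V(b,e+1,h) \le V(b,e,h),\\
&\phantom{\text{(ii) }}\text{increasing differences in } e: && V(b,e+2,h) - V(b,e+1,h) \ge V(b,e+1,h) - V(b,e,h),\\
&\text{(iii) submodular in } (b,e): && V(b+1,e+1,h) - V(b,e+1,h) \le V(b+1,e,h) - V(b,e,h).
\end{align*}
```

Under this reading, property (iii) says that the marginal cost of one more backlogged packet is no larger when the battery holds more energy.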
Deep Reinforcement Learning for Delay-Sensitive LTE Downlink Scheduling
We consider an LTE downlink scheduling system where a base station allocates resource blocks (RBs) to users running delay-sensitive applications. We aim to find a scheduling policy that minimizes the queuing delay experienced by the users. We formulate this problem as a Markov decision process (MDP) whose state combines the channel quality indicator (CQI) of each user in each RB with the queue status of each user. To solve this complex problem, which involves high-dimensional state and action spaces, we propose a deep reinforcement learning based scheduling framework that uses the Deep Deterministic Policy Gradient (DDPG) algorithm. Our extensive experiments demonstrate that our approach outperforms state-of-the-art benchmarks in terms of average throughput, queuing delay, and fairness, achieving up to 55% lower queuing delay than the best benchmark.
- PAR ID: 10253848
- Date Published:
- Journal Name: 2020 IEEE 31st Annual International Symposium on Personal, Indoor and Mobile Radio Communications
- Page Range / eLocation ID: 1 to 6
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The radio access network (RAN) in 5G is expected to satisfy the stringent delay requirements of a variety of applications. The packet scheduler plays an important role by allocating spectrum resources to user equipments (UEs) at each transmit time interval (TTI). In this paper, we show that optimal scheduling is a challenging combinatorial optimization problem, which is hard to solve within the channel coherence time with conventional optimization methods. Rule-based scheduling methods, on the other hand, are hard to adapt to the time-varying wireless channel conditions and various data request patterns of UEs. Recently, integrating artificial intelligence (AI) into wireless networks has drawn great interest from both academia and industry. In this paper, we incorporate deep reinforcement learning (DRL) into the design of cellular packet scheduling. A delay-aware cell traffic scheduling algorithm is developed to map the observed system state to a scheduling decision. Due to the huge state space, a recurrent neural network (RNN) is utilized to approximate the optimal action-policy function. Unlike conventional rule-based scheduling methods, the proposed scheme can learn from its interactions with the environment and adaptively choose the best scheduling decision at each TTI. Simulation results show that the DRL-based packet scheduling achieves the lowest average delay among several conventional approaches, and that the UEs' average queue lengths are also significantly reduced. The developed method also exhibits great potential for real-time scheduling in delay-sensitive scenarios.
- With the wide adoption of deep neural network (DNN) models for various applications, enterprises and cloud providers have built deep learning clusters and increasingly deployed specialized accelerators, such as GPUs and TPUs, for DNN training jobs. Existing schedulers fall short when arbitrating cluster resources among multi-user jobs: they either lack fine-grained heterogeneity awareness or are hard to generalize to various scheduling policies. To fill this gap, we propose a novel design of a task-level heterogeneity-aware scheduler, Hadar, based on an online optimization framework that can express other scheduling algorithms. Hadar leverages the performance traits of DNN jobs on a heterogeneous cluster, characterizes the task-level performance heterogeneity in the optimization problem, and makes scheduling decisions across both spatial and temporal dimensions. The primal-dual framework is employed, with our design of a dual subroutine, to solve the optimization problem and guide the scheduling design. Extensive trace-driven simulations with representative DNN models demonstrate that Hadar improves the average job completion time (JCT) by 3× over an Apache YARN-based resource manager used in production. Moreover, Hadar outperforms Gavel [1], the state-of-the-art heterogeneity-aware scheduler, by 2.5× for the average JCT, shortens the queuing delay by 13%, and improves finish-time fairness (FTF) by 1.5%.
- Hemmer, Philip R.; Migdall, Alan L. (Ed.) We study a quantum switch that creates shared end-to-end entangled quantum states for multiple sets of users that are connected to it. Each user is connected to the switch via an optical link across which bipartite Bell-state entangled states are generated in each time-slot with certain probabilities, and the switch merges entanglements of links to create end-to-end entanglements for users. One qubit of a link entanglement is stored at the switch and the other qubit is stored at the user corresponding to the link. Assuming that the qubits of link entanglements decohere after one time-slot, we characterize the capacity region, which is defined as the set of arrival rates of requests for end-to-end entanglements for which there exists a scheduling policy that stabilizes the switch. We propose a Max-Weight scheduling policy and show that it stabilizes the switch for all arrival rates that lie in the capacity region. We also provide numerical results to support our analysis.
- Time-Sensitive Networking (TSN) is designed for real-time applications, usually pertaining to a set of Time-Triggered (TT) data flows. TT traffic generally requires low packet loss and guaranteed upper bounds on end-to-end delay. To guarantee the end-to-end delay bounds, TSN uses the Time-Aware Shaper (TAS) to provide deterministic service to TT flows. Each frame of TT traffic is assigned a specific time slot at each switch for its transmission. Several factors may influence frame transmissions, which in turn affect the scheduling of the whole network. These factors may cause frames to be sent in the wrong time slots, which we call misbehaviors. To reduce the occurrence of misbehaviors, we need to find a proper schedule for the whole network. In our research, we use a reinforcement learning model called Deep Deterministic Policy Gradient (DDPG) to find a suitable schedule. DDPG is used to model the uncertainty caused by transmission-influencing factors such as time-synchronization errors. Compared with the state of the art, our approach using DDPG significantly decreases the number of misbehaviors in the TSN scenarios studied and improves the delay performance of the network.
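As an illustration of the Max-Weight idea named in the quantum switch abstract above, the following is a minimal generic sketch and not the authors' algorithm. It assumes a simplified model: each request class needs one fresh link entanglement from every user in its set, each link entanglement can be consumed once per time-slot, and the weight of a class is its current queue length. The function name `max_weight_schedule` and its parameters are hypothetical.

```python
from itertools import combinations


def max_weight_schedule(queues, user_sets, link_up):
    """Return indices of request classes to serve this slot (hypothetical helper).

    queues[k]    -- backlog of requests for class k (assumed weight)
    user_sets[k] -- tuple of user indices whose links class k needs
    link_up[u]   -- True if user u's link generated an entanglement this slot
    """
    # A class is eligible only if it has backlog and all of its links are up.
    eligible = [
        k for k, users in enumerate(user_sets)
        if queues[k] > 0 and all(link_up[u] for u in users)
    ]
    best, best_weight = (), 0
    # Brute-force search over subsets; fine for a sketch, exponential in general.
    for r in range(1, len(eligible) + 1):
        for combo in combinations(eligible, r):
            used = [u for k in combo for u in user_sets[k]]
            if len(used) != len(set(used)):
                continue  # a link entanglement cannot be used by two classes
            weight = sum(queues[k] for k in combo)
            if weight > best_weight:
                best, best_weight = combo, weight
    return list(best)


# Three request classes over four users; classes 0 and 2 do not share a link.
queues = [3, 5, 3]
user_sets = [(0, 1), (1, 2), (2, 3)]
schedule = max_weight_schedule(queues, user_sets, [True, True, True, True])
# With user 3's link down, class 2 is ineligible and class 1 has the largest backlog.
schedule_link_down = max_weight_schedule(queues, user_sets, [True, True, True, False])
```

The exhaustive subset search is chosen only for readability; a practical scheduler would exploit the structure of the feasible set instead.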