Title: Delay-aware Cellular Traffic Scheduling with Deep Reinforcement Learning
Abstract: The radio access network (RAN) in 5G is expected to satisfy the stringent delay requirements of a variety of applications. The packet scheduler plays an important role by allocating spectrum resources to user equipments (UEs) at each transmit time interval (TTI). In this paper, we show that optimal scheduling is a challenging combinatorial optimization problem that is hard to solve within the channel coherence time using conventional optimization methods. Rule-based scheduling methods, on the other hand, are hard to adapt to time-varying wireless channel conditions and the varied data request patterns of UEs. Recently, integrating artificial intelligence (AI) into wireless networks has drawn great interest from both academia and industry. In this paper, we incorporate deep reinforcement learning (DRL) into the design of cellular packet scheduling. A delay-aware cell traffic scheduling algorithm is developed to map the observed system state to a scheduling decision. Because the state space is huge, a recurrent neural network (RNN) is used to approximate the optimal action-policy function. Unlike conventional rule-based scheduling methods, the proposed scheme can learn from its interactions with the environment and adaptively choose the best scheduling decision at each TTI. Simulation results show that DRL-based packet scheduling achieves the lowest average delay among the conventional approaches it is compared with. The UEs' average queue lengths are also significantly reduced. The developed method also exhibits great potential for real-time scheduling in delay-sensitive scenarios.
Award ID(s):
1821819 1822055
NSF-PAR ID:
10276412
Author(s) / Creator(s):
Date Published:
Journal Name:
GLOBECOM 2020 - 2020 IEEE Global Communications Conference
Page Range / eLocation ID:
1 to 6
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Mobile wireless networks present several challenges for any learning system, due to uncertain and variable device movement, a decentralized network architecture, and constraints on network resources. In this work, we use deep reinforcement learning (DRL) to learn a scalable and generalizable forwarding strategy for such networks. We make the following contributions: i) we use hierarchical RL to design DRL packet agents rather than device agents, to capture the packet forwarding decisions that are made over time and improve training efficiency; ii) we use relational features to ensure generalizability of the learned forwarding strategy to a wide range of network dynamics and enable offline training; and iii) we incorporate both forwarding goals and network resource considerations into packet decision-making by designing a weighted DRL reward function. Our results show that our DRL agent often achieves a similar delay per packet delivered as the optimal forwarding strategy and outperforms all other strategies including state-of-the-art strategies, even on scenarios on which the DRL agent was not trained. 
  2. We consider an energy harvesting sensor transmitting latency-sensitive data over a fading channel. We aim to find the optimal transmission scheduling policy that minimizes the packet queuing delay given the available harvested energy. We formulate the problem as a Markov decision process (MDP) over a state-space spanned by the transmitter's buffer, battery, and channel states, and analyze the structural properties of the resulting optimal value function, which quantifies the long-run performance of the optimal scheduling policy. We show that the optimal value function (i) is non-decreasing and has increasing differences in the queue backlog; (ii) is non-increasing and has increasing differences in the battery state; and (iii) is submodular in the buffer and battery states. Our numerical results confirm these properties and demonstrate that the optimal scheduling policy outperforms a so-called greedy policy in terms of sensor outages, buffer overflows, energy efficiency, and queuing delay.
  3. Age of Information (AoI) measures the time elapsed since the last received information packet was generated at the source. We consider the problem of AoI minimization for single-hop flows in a wireless network, under pairwise interference constraints and time-varying channels. We consider a simple, yet broad, class of distributed scheduling policies in which a transmission is attempted over each link with a certain attempt probability. We obtain an interesting relation between the optimal attempt probability of a link and the optimal AoI of that link and its neighboring links. We then show that the optimal attempt probabilities can be computed by solving a convex optimization problem, which can be done distributively.
  4. In this paper, we consider the problem of joint offloading and wireless scheduling design for parallel computing applications with hard deadlines. This is motivated by the rapid growth of compute-intensive mobile parallel computing applications (e.g., real-time video analysis, language translation) that must be processed within a hard deadline. While there are many works on joint computing and communication algorithm design, most of them focus on minimizing the average computing time and may not be applicable to mobile applications with hard deadlines. In this work, we explicitly take hard deadlines for computing tasks into account and develop a joint offloading and scheduling algorithm based on the stochastic network optimization framework. The proposed algorithm is shown to achieve average energy consumption arbitrarily close to the optimal one. However, this algorithm involves a strong coupling between offloading and scheduling decisions, which poses significant challenges for its implementation. To this end, we first decouple the offloading and scheduling decisions in the case of a one-time-slot deadline by exploring the intrinsic structure of the proposed algorithm. Based on this, we further implement the proposed algorithm in general setups. Simulations are provided to corroborate our findings.
  5. Machine learning applied to architecture design presents a promising opportunity with broad applications. Recent deep reinforcement learning (DRL) techniques, in particular, enable efficient exploration in vast design spaces where conventional design strategies may be inadequate. This paper proposes a novel deep reinforcement learning framework, taking routerless networks-on-chip (NoC) as an evaluation case study. The new framework successfully resolves problems with prior design approaches, which are either unreliable due to random searches or inflexible due to severe design space restrictions. The framework learns (near-)optimal loop placement for routerless NoCs with various design constraints. A deep neural network is developed using parallel threads that efficiently explore the immense routerless NoC design space with a Monte Carlo search tree. Experimental results show that, compared with conventional mesh, the proposed DRL routerless design achieves a 3.25x increase in throughput, 1.6x reduction in packet latency, and 5x reduction in power. Compared with the state-of-the-art routerless NoC, DRL achieves a 1.47x increase in throughput, 1.18x reduction in packet latency, 1.14x reduction in average hop count, and 6.3% lower power consumption.