

Title: Deep Q-Network Based Power Allocation Meets Reservoir Computing in Distributed Dynamic Spectrum Access Networks
Dynamic spectrum access (DSA) is regarded as one of the key enabling technologies for future communication networks. In this paper, we introduce a power allocation strategy for distributed DSA networks using a powerful machine learning tool, namely deep reinforcement learning. The introduced power allocation strategy enables DSA users to conduct power allocation in a distributed fashion without relying on channel state information or cooperation among DSA users. Furthermore, to capture the temporal correlation of the underlying DSA network environments, reservoir computing, a special class of recurrent neural networks, is employed to realize the introduced deep reinforcement learning scheme. The combination of reservoir computing and deep reinforcement learning significantly improves the efficiency of the introduced resource allocation scheme. Simulation evaluations are conducted to demonstrate the effectiveness of the introduced power allocation strategy.
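As a rough illustration of the approach described in the abstract, the following minimal sketch (not the authors' implementation) uses an echo state network, a common reservoir computing model, as the Q-function in a DQN-style power allocation loop. The observation features, discrete power levels, reward, and environment interaction are placeholder assumptions.

```python
# Minimal sketch: echo state network (ESN) as the Q-function in a DQN-style
# power allocation loop. Observations, rewards, and power levels are assumed.
import numpy as np

rng = np.random.default_rng(0)

N_RES = 100          # reservoir size
N_OBS = 4            # assumed local observation size (e.g., SINR, ACK history)
POWER_LEVELS = np.array([0.0, 5.0, 10.0, 15.0, 20.0])   # dBm, assumed action set
GAMMA, LR, EPS = 0.9, 1e-2, 0.1

# Fixed random reservoir: only the readout W_out is trained (ESN principle).
W_in = rng.uniform(-0.5, 0.5, (N_RES, N_OBS))
W_res = rng.uniform(-0.5, 0.5, (N_RES, N_RES))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))        # keep spectral radius < 1
W_out = np.zeros((len(POWER_LEVELS), N_RES))

def reservoir_step(state, obs):
    """Standard ESN state update; the state carries temporal correlation."""
    return np.tanh(W_in @ obs + W_res @ state)

def q_values(state):
    return W_out @ state

state = np.zeros(N_RES)
obs = rng.random(N_OBS)                                   # placeholder observation
for t in range(1000):
    state = reservoir_step(state, obs)
    q = q_values(state)
    a = rng.integers(len(POWER_LEVELS)) if rng.random() < EPS else int(np.argmax(q))
    # Environment interaction is mocked: reward and next observation are random.
    reward, next_obs = rng.random(), rng.random(N_OBS)
    next_state = reservoir_step(state, next_obs)
    td_target = reward + GAMMA * np.max(q_values(next_state))
    td_error = td_target - q[a]
    W_out[a] += LR * td_error * state                     # gradient step on readout only
    obs = next_obs                                        # POWER_LEVELS[a] would be transmitted
```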
Award ID(s):
1811497 1802710 1811720
NSF-PAR ID:
10161678
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)
Page Range / eLocation ID:
774 to 779
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Federated learning (FL) is a highly pursued machine learning technique that can train a model centrally while keeping data distributed. Distributed computation makes FL attractive for bandwidth-limited applications, especially in wireless communications. A large number of distributed edge devices can be connected to a central parameter server (PS), iteratively downloading data from and uploading data to the PS. Due to limited bandwidth, only a subset of connected devices can be scheduled in each round. State-of-the-art machine learning models, such as deep learning models, usually contain millions of parameters, resulting in high computational complexity as well as a high communication burden for collecting/distributing data during training. To improve communication efficiency and make the training model converge faster, we propose a new scheduling policy and power allocation scheme using non-orthogonal multiple access (NOMA) settings to maximize the weighted sum data rate under practical constraints during the entire learning process. NOMA allows multiple users to transmit on the same channel simultaneously. The user scheduling problem is transformed into a maximum-weight independent set problem that can be solved using graph theory, as sketched below. Simulation results show that the proposed scheduling and power allocation scheme can help achieve a higher FL testing accuracy in NOMA-based wireless networks than other existing schemes within the same learning time.
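To illustrate the graph-theoretic step mentioned above, the sketch below solves a toy maximum-weight independent set instance by exhaustive search. The conflict edges, device weights, and solver are illustrative assumptions, not the paper's formulation or NOMA rate model.

```python
# Toy maximum-weight independent set (MWIS) scheduling sketch.
import itertools

# Node: device index; weight: assumed weighted data rate if scheduled.
weights = {0: 3.2, 1: 1.5, 2: 2.7, 3: 2.0}
# Edge (i, j): devices i and j cannot be scheduled together (assumed conflicts).
conflicts = {(0, 1), (1, 2), (2, 3)}

def is_independent(subset):
    return all((i, j) not in conflicts and (j, i) not in conflicts
               for i, j in itertools.combinations(subset, 2))

def max_weight_independent_set(nodes):
    """Exhaustive search; fine for a handful of devices, exponential in general."""
    best, best_w = (), 0.0
    for r in range(1, len(nodes) + 1):
        for subset in itertools.combinations(nodes, r):
            if is_independent(subset):
                w = sum(weights[i] for i in subset)
                if w > best_w:
                    best, best_w = subset, w
    return best, best_w

scheduled, total_rate = max_weight_independent_set(list(weights))
print("scheduled devices:", scheduled, "weighted sum rate:", total_rate)
```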
  2. Reservoir computing, a neural computing framework suited for temporal information processing, utilizes a dynamic reservoir layer for high-dimensional encoding, enhancing the separability of the network. In this paper, we exploit a Deep Learning (DL)-based detection strategy for Multiple-Input Multiple-Output Orthogonal Frequency-Division Multiplexing (MIMO-OFDM) symbol detection. To be specific, we introduce a Deep Echo State Network (DESN), a unique hierarchical processing structure with multiple time intervals, to enhance the memory capacity and improve detection efficiency; a simplified forward pass is sketched below. The resulting hardware prototype with the hybrid memristor-CMOS co-design provides in-memory computing and parallel processing capabilities, significantly reducing the hardware and power overhead. With the standard 180 nm CMOS process and memristive synapses, the introduced DESN consumes merely 105 mW, a 16.7% power reduction compared to shallow ESN designs, even with more dynamic layers and associated neurons. Furthermore, numerical evaluations demonstrate the advantages of the DESN over state-of-the-art detection techniques in the literature for MIMO-OFDM systems even with a very limited training set, yielding a 47.8% improvement over conventional symbol detection techniques.
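The following sketch shows, under assumed layer sizes and input dimensions, the forward pass of a deep (stacked) echo state network in plain software; it does not model the paper's multiple time intervals or the memristor-CMOS hardware mapping.

```python
# Minimal software sketch of a stacked (deep) echo state network forward pass.
import numpy as np

rng = np.random.default_rng(1)

def make_layer(n_in, n_res, spectral_radius=0.9):
    w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    w_res = rng.uniform(-0.5, 0.5, (n_res, n_res))
    w_res *= spectral_radius / max(abs(np.linalg.eigvals(w_res)))
    return w_in, w_res

LAYER_SIZES = [64, 64]        # two stacked reservoirs (assumed)
N_IN = 8                      # e.g., real/imag parts of received OFDM samples (assumed)

layers, n_prev = [], N_IN
for n_res in LAYER_SIZES:
    layers.append(make_layer(n_prev, n_res))
    n_prev = n_res

def desn_forward(inputs):
    """Run a sequence through the stacked reservoirs; return concatenated states."""
    states = [np.zeros(n) for n in LAYER_SIZES]
    collected = []
    for x in inputs:
        signal = x
        for k, (w_in, w_res) in enumerate(layers):
            states[k] = np.tanh(w_in @ signal + w_res @ states[k])
            signal = states[k]              # feed this layer's state to the next layer
        collected.append(np.concatenate(states))
    return np.array(collected)

seq = rng.standard_normal((20, N_IN))       # placeholder received samples
H = desn_forward(seq)                       # states consumed by a trained linear readout
print(H.shape)                              # (20, 128)
```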
  3. Mobile edge computing (MEC) is an emerging paradigm that integrates computing resources in wireless access networks to process computational tasks in close proximity to mobile users with low latency. In this paper, we propose an online double deep Q-network (DDQN)-based learning scheme for task assignment in dynamic MEC networks, which enables multiple distributed edge nodes and a cloud data center to jointly process user tasks to achieve optimal long-term quality of service (QoS). The proposed scheme captures a wide range of dynamic network parameters, including non-stationary node computing capabilities, network delay statistics, and task arrivals. It learns the optimal task assignment policy with no assumption on the knowledge of the underlying dynamics. In addition, the proposed algorithm accounts for both performance and complexity, and addresses the state and action space explosion problem in conventional Q-learning; a generic double-DQN target update is sketched below. The evaluation results show that the proposed DDQN-based task assignment scheme significantly improves the QoS performance compared to existing schemes that do not consider the effects of network dynamics on the expected long-term rewards, while scaling reasonably well as the network size increases.
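As a rough illustration of the double-DQN idea referenced above, the sketch below computes the double-DQN target with linear Q-functions and a mocked transition. The state features, action set, and reward are assumptions, not the paper's task assignment model.

```python
# Generic double-DQN target: online net selects the action, target net evaluates it.
import numpy as np

rng = np.random.default_rng(2)

N_STATE = 6            # assumed state features (queue lengths, delays, arrivals, ...)
N_ACTION = 4           # assumed assignment choices (e.g., 3 edge nodes + cloud)
GAMMA, LR = 0.95, 1e-2

W_online = rng.standard_normal((N_ACTION, N_STATE)) * 0.1
W_target = W_online.copy()

def q(w, s):
    return w @ s

def ddqn_update(s, a, r, s_next):
    """Double DQN: online net picks the next action, target net evaluates it."""
    a_next = int(np.argmax(q(W_online, s_next)))        # selection by online net
    target = r + GAMMA * q(W_target, s_next)[a_next]    # evaluation by target net
    td_error = target - q(W_online, s)[a]
    W_online[a] += LR * td_error * s                    # gradient step for linear Q

# One mocked transition, followed by a soft target-network update.
s, s_next = rng.random(N_STATE), rng.random(N_STATE)
ddqn_update(s, a=1, r=0.7, s_next=s_next)
W_target = 0.99 * W_target + 0.01 * W_online
```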
  4. We consider the problem of spectrum sharing by multiple cellular operators. We propose a novel deep Reinforcement Learning (DRL)-based distributed power allocation scheme which utilizes the multi-agent Deep Deterministic Policy Gradient (MA-DDPG) algorithm. In particular, we model the base stations (BSs) that belong to the multiple operators sharing the same band as DRL agents that simultaneously determine the transmit powers to their scheduled user equipment (UE) in a synchronized manner. The power decision of each BS is based on its own observation of the radio frequency (RF) environment, which consists of interference measurements reported from the UEs it serves, and a limited amount of information obtained from other BSs. One advantage of the proposed scheme is that it addresses the single-agent non-stationarity problem of RL in the multi-agent scenario by incorporating the actions and observations of other BSs into each BS's own critic, which helps it gain a more accurate perception of the overall RF environment. A centralized-training-distributed-execution framework is used to train the policies: the critics are trained over the joint actions and observations of all BSs, while the actor of each BS only takes the local observation as input in order to produce the transmit power (this split is sketched below). Simulation with the 6 GHz Unlicensed National Information Infrastructure (U-NII)-5 band shows that the proposed power allocation scheme can achieve better throughput performance than several state-of-the-art approaches.
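The sketch below illustrates, with assumed shapes and linear function approximators, the centralized-training / distributed-execution split of MA-DDPG: each BS actor acts on its local observation only, while each critic scores the joint observations and actions of all BSs. It is a structural sketch, not the paper's network design.

```python
# Structural sketch of MA-DDPG roles: local actors, centralized critics.
import numpy as np

rng = np.random.default_rng(3)

N_BS = 3           # number of base stations / agents (assumed)
OBS_DIM = 5        # local observation size, e.g., reported interference (assumed)
ACT_DIM = 1        # transmit power (continuous scalar)

# Actor i: local observation -> transmit power; critic i: joint (obs, act) -> value.
actors = [rng.standard_normal((ACT_DIM, OBS_DIM)) * 0.1 for _ in range(N_BS)]
critics = [rng.standard_normal(N_BS * (OBS_DIM + ACT_DIM)) * 0.1 for _ in range(N_BS)]

def act(i, local_obs, p_max=1.0):
    """Distributed execution: BS i only needs its own observation."""
    raw = actors[i] @ local_obs
    return p_max / (1.0 + np.exp(-raw))          # squash to a valid power range

def critic_value(i, joint_obs, joint_act):
    """Centralized training: critic i sees every BS's observation and action."""
    x = np.concatenate([np.concatenate(joint_obs), np.concatenate(joint_act)])
    return float(critics[i] @ x)

joint_obs = [rng.random(OBS_DIM) for _ in range(N_BS)]
joint_act = [act(i, joint_obs[i]) for i in range(N_BS)]
print([critic_value(i, joint_obs, joint_act) for i in range(N_BS)])
```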
  5. Millimeter-wave (mmWave) communication is anticipated to provide significant throughput gains in urban scenarios. To this end, network densification is a necessity to meet the high traffic volume generated by smartphones, tablets, and sensory devices while overcoming the large pathloss and severe blockage at mmWave frequencies. These denser networks are created by users deploying small mmWave base stations (BSs) in a plug-and-play fashion. Although this deployment method provides the required density, the amorphous deployment of BSs requires distributed management. To address this difficulty, we propose a self-organizing method to allocate power to mmWave BSs in an ultra-dense network. The proposed method consists of two parts: clustering using fast local clustering and power allocation via Q-learning (the Q-learning stage is sketched below). The important features of the proposed method are its scalability and self-organizing capabilities, both of which are important features of 5G. Our simulations demonstrate that the introduced method provides the required quality of service (QoS) for all users, independent of the size of the network.
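As a rough illustration of the second stage only, the sketch below runs tabular Q-learning over discrete power levels against a mocked environment. The clustering stage, state definition, and QoS-based reward of the paper are assumptions not reproduced here.

```python
# Tabular Q-learning over discrete power levels (mocked environment).
import numpy as np

rng = np.random.default_rng(4)

POWER_LEVELS = [0.1, 0.5, 1.0, 2.0]      # Watts, assumed discrete action set
N_STATES = 8                             # e.g., quantized interference levels (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

Q = np.zeros((N_STATES, len(POWER_LEVELS)))

def step(state, action):
    """Mock environment: returns a random reward and next state."""
    return rng.random(), rng.integers(N_STATES)

state = 0
for t in range(5000):
    if rng.random() < EPS:
        action = rng.integers(len(POWER_LEVELS))      # explore
    else:
        action = int(np.argmax(Q[state]))             # exploit
    reward, next_state = step(state, action)
    # Standard Q-learning update.
    Q[state, action] += ALPHA * (reward + GAMMA * np.max(Q[next_state]) - Q[state, action])
    state = next_state

print("chosen power per state:", [POWER_LEVELS[int(np.argmax(Q[s]))] for s in range(N_STATES)])
```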