Title: Reinforcement Learning for Mixed Cooperative/Competitive Dynamic Spectrum Access
A dynamic spectrum sharing problem with a mixed collaborative/competitive objective and partial information about peers’ performances that arises from the DARPA Spectrum Collaboration Challenge is considered. Because of the very high complexity of the problem and the enormous size of the state space, it is broken down into the subproblems of channel selection, flow admission control, and transmission schedule assignment. The channel selection problem is the focus of this paper. A reinforcement learning algorithm based on a reduced state is developed to select channels, and a neural network is used as a function approximator to fill in missing values in the resulting input-action matrix. The performance is compared with that obtained by a hand-tuned expert system.
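As a rough illustration of value-based channel selection over a reduced state, the sketch below uses plain tabular Q-learning with an epsilon-greedy policy. It is not the paper's algorithm: the class name `ChannelSelector`, the state encoding, the reward signal, and all parameter values are assumptions made for this example, and the neural-network approximator described in the abstract is not modelled (unvisited state-channel entries simply default to 0).

```python
import random
from collections import defaultdict


class ChannelSelector:
    """Hypothetical epsilon-greedy Q-learning over (reduced_state, channel) pairs."""

    def __init__(self, n_channels, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_channels = n_channels
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.q = defaultdict(float)  # (state, channel) -> estimated value
        self.rng = random.Random(seed)

    def select(self, state):
        """Pick a channel: explore with probability epsilon, else act greedily."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_channels)
        values = [self.q[(state, c)] for c in range(self.n_channels)]
        return values.index(max(values))

    def update(self, state, channel, reward, next_state):
        """Standard one-step Q-learning update for the chosen channel."""
        best_next = max(self.q[(next_state, c)] for c in range(self.n_channels))
        target = reward + self.gamma * best_next
        self.q[(state, channel)] += self.alpha * (target - self.q[(state, channel)])
```

In use, a caller would map its observations to a small hashable state (the "reduced state"), call `select` to choose a channel, and feed the observed reward back through `update`. Replacing the default-zero table lookup with a learned approximator is where a neural network, as mentioned in the abstract, would come in.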
Award ID(s):
1642973
PAR ID:
10133367
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 2019 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN)
Page Range / eLocation ID:
1 to 6
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The success of dynamic spectrum sharing in wireless networks depends on reliable automated enforcement of spectrum access policies. In this paper, a crowdsourced approach is used to select volunteers to detect spectrum misuse. Volunteer selection is based on multiple criteria, including their reputation, likelihood of being in a region, and ability to effectively detect channel misuse. We formulate the volunteer selection problem as a stable matching problem, whereby volunteers' monitoring preferences are matched to channels' attributes. Given a set of volunteers, the objective is to ensure maximum coverage of the spectrum enforcement area and accurate detection of spectrum access violations on all channels in the area. The two matching algorithms, Volunteer Matching (VM) and Reverse Volunteer Matching (RVM), are based on variants of the Gale-Shapley algorithm for stable matching. We also propose two hybrid algorithms, HYBRID-VM and HYBRID-RVM, that augment the matching algorithms with a Secretary-based algorithm to overcome the shortcomings of the individual vanilla algorithms. Simulation results show that volunteer selection using HYBRID-VM gives better regional coverage (19.2% better than the threshold-based Secretary algorithm), better detection accuracy, and better volunteer happiness than the other algorithms tested.
  2. We consider the optimal link rate selection problem in time-varying wireless channels with unknown channel statistics. The aim of optimal link rate selection is to transmit at the optimal rate in each time slot in order to maximize the expected throughput of the wireless channel/link or, equivalently, to minimize the expected regret. Lack of information about the channel state or channel statistics necessitates the use of online/sequential learning algorithms to determine the optimal rate. We present CoTS (Constrained Thompson Sampling), an algorithm that improves upon the current state of the art, is fast, and is general in the sense that it can handle several different constraints in the problem with the same algorithm. We also prove an asymptotic lower bound on the expected regret and a high-probability large-horizon upper bound on the regret, which show that the regret grows logarithmically with time in an order sense. We also provide numerical results which establish that CoTS significantly outperforms the current state-of-the-art algorithms.
  3. In this paper, we address the problem of channel allocation for femtocells that share the use of regular macrocell spectrum. The femto base station (FBS) scheduling problem is formulated within the restless multi-armed bandit (RMAB) framework. Our goal is to choose the arms/channels that maximize the total expected discounted reward over an infinite horizon while minimizing the interference induced by sharing channels with the macrocell. Without direct observation of the true channel state, we use the available macrocell user feedback known as the channel quality indicator (CQI). In general, the RMAB problem is PSPACE-hard. We propose a low-complexity heuristic indexing policy, referred to as the approximated Whittle index, to rank available channels for the FBS. Although finding a closed-form channel ranking solution typically involves dynamic programming, we show that, based on the partial channel information within the CQI, there exists a closed form for the channel index. Moreover, we demonstrate the performance advantage of the proposed indexing policy over a myopic policy.
  4. This paper investigates the resource allocation problem in device-to-device (D2D)-based vehicular communications, based on slow fading statistics of channel state information (CSI), to alleviate the signaling overhead of reporting rapidly varying accurate CSI of mobile links. We consider the case in which each vehicle-to-infrastructure (V2I) link shares spectrum with multiple vehicle-to-vehicle (V2V) links. Leveraging the slow fading statistical CSI of mobile links, we maximize the sum V2I capacity while guaranteeing the reliability of all V2V links. We propose a graph-based algorithm that uses graph partitioning tools to divide highly interfering V2V links into different clusters before formulating the spectrum sharing problem as a weighted 3-dimensional matching problem, which is then solved by adapting a high-performance approximation algorithm.
  5. The Industrial Internet of Things (IIoT) has been shown to be of great value to the deployment of smart industrial environments. With the immense growth of IoT devices, dynamic spectrum sharing has been introduced and is envisaged as a promising solution to the spectrum shortage in IIoT. Meanwhile, cyber-physical safety remains a great concern for the reliable operation of IIoT systems. In this paper, we consider dynamic spectrum access in IIoT under a Received Signal Strength (RSS) based adversarial localization attack. We employ a practical and effective power perturbation approach to mitigate the localization threat to the IoT devices and cast the privacy-preserving spectrum sharing problem as a stochastic channel selection game. To address the randomness induced by the power perturbation approach, we develop a two-timescale distributed learning algorithm that converges almost surely to the set of correlated equilibria of the game. The numerical results show the convergence of the algorithm and corroborate that the two-timescale design of the learning process effectively alleviates the network throughput degradation caused by the power perturbation procedure.
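
Item 1 above formulates volunteer selection as a stable matching problem built on variants of the Gale-Shapley algorithm. For orientation only, here is a sketch of the textbook one-to-one Gale-Shapley procedure, not the VM, RVM, or hybrid variants from that paper. The function name, the volunteer and channel labels, and the assumption that every receiver ranks every proposer who may propose to it are all choices made for this example.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Textbook Gale-Shapley; returns a stable matching as {proposer: receiver}.

    proposer_prefs / receiver_prefs map each party to a list of the other
    side's members, most preferred first (hypothetical input format).
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to try
    engaged = {}                                  # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p            # r was free and accepts
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p            # r prefers p and releases its partner
            free.append(current)
        else:
            free.append(p)            # r rejects p, who will try its next choice

    return {p: r for r, p in engaged.items()}


# Illustrative data: volunteers propose to channels.
volunteer_prefs = {"v1": ["c1", "c2"], "v2": ["c1", "c2"]}
channel_prefs = {"c1": ["v2", "v1"], "c2": ["v1", "v2"]}
matching = gale_shapley(volunteer_prefs, channel_prefs)
```

Swapping which side proposes changes which side the resulting stable matching favors, which is presumably the distinction between a volunteer-proposing and a channel-proposing ("reverse") variant, though the abstract does not spell this out.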