This content will become publicly available on March 1, 2026

Title: Fair Dynamic Spectrum Access via Fully Decentralized Multi-Agent Reinforcement Learning
We consider a decentralized wireless network with several source-destination pairs sharing a limited number of orthogonal frequency bands. Sources learn to adapt their transmissions (specifically, their band selection strategy) over time, in a decentralized manner, without sharing information with each other. Sources can only observe the outcome of their own transmissions (i.e., success or collision) and have no prior knowledge of the network size or of the transmission strategies of other sources. The goal of each source is to maximize its own throughput while striving for network-wide fairness. We propose a novel fully decentralized Reinforcement Learning (RL)-based solution that achieves fairness without coordination. The proposed Fair Share RL (FSRL) solution combines: (i) state augmentation with a semi-adaptive time reference; (ii) an architecture that leverages risk control and time difference likelihood; and (iii) a fairness-driven reward structure. We evaluate FSRL in more than 50 network settings with different numbers of agents and different amounts of available spectrum, in the presence of jammers, and in an ad-hoc setting. Simulation results suggest that, when we compare FSRL with a common baseline RL algorithm from the literature, FSRL can be up to 89.0% fairer (as measured by Jain’s fairness index) in stringent settings with several sources and a single frequency band, and 48.1% fairer on average.
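The abstract reports fairness using Jain's fairness index. As a minimal sketch, the standard definition of the index, J(x) = (sum of x_i)^2 / (n * sum of x_i^2), can be computed as below. The function name `jain_index`, the example throughput vectors, and the handling of the all-zero case are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def jain_index(throughputs: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Ranges from 1/n (one source gets everything) to 1.0 (equal shares).
    """
    n = len(throughputs)
    if n == 0:
        raise ValueError("need at least one throughput value")
    if any(x < 0 for x in throughputs):
        raise ValueError("throughputs must be non-negative")

    total = sum(throughputs)
    sum_of_squares = sum(x * x for x in throughputs)
    if sum_of_squares == 0:
        # The index is undefined when nobody transmits successfully.
        # Returning 0.0 is a convention chosen for this sketch only.
        return 0.0
    return (total * total) / (n * sum_of_squares)


if __name__ == "__main__":
    # Hypothetical per-source throughputs (fraction of successful slots).
    equal_share = [0.25, 0.25, 0.25, 0.25]
    one_winner = [1.0, 0.0, 0.0, 0.0]
    print(jain_index(equal_share))
    print(jain_index(one_winner))
```

With four sources, an equal split yields the maximum index of 1, while a single source capturing the band yields the minimum of 1/4, which is why the index is a natural measure in the single-band settings the abstract highlights.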
Award ID(s):
2148128
PAR ID:
10582239
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
arxiv.org/abs/2503.24296
Date Published:
Journal Name:
arXiv
ISSN:
arXiv.2503.24296
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider a decentralized wireless network with several source-destination pairs sharing a limited number of orthogonal frequency bands. Sources learn to adapt their transmissions (specifically, their band selection strategy) over time, in a decentralized manner, without sharing information with each other. Sources can only observe the outcome of their own transmissions (i.e., success or collision), having no prior knowledge of the network size or of the transmission strategy of other sources. The goal of each source is to maximize their own throughput while striving for network-wide fairness. We propose a novel fully decentralized Reinforcement Learning (RL)-based solution that achieves fairness without coordination. The proposed Fair Share RL (FSRL) solution combines: (i) state augmentation with a semi-adaptive time reference; (ii) an architecture that leverages risk control and time difference likelihood; and (iii) a fairness-driven reward structure. We evaluate FSRL in several network settings. Simulation results suggest that, when we compare FSRL with a common baseline RL algorithm from the literature, FSRL can be up to 89.0% fairer (as measured by Jain’s fairness index) in stringent settings with several sources and a single frequency band, and 48.1% fairer on average.
  2. Radio frequency identification (RFID) is a technology for automated identification of objects and people. RFID technology is expected to find extensive use in applications related to the Internet of Things, and in particular applications of Internet of Battlefield Things. Of particular interest are passive RFID tags due to a number of their salient advantages. Such tags, lacking energy sources of their own, use backscattering of the power of an RF source (a reader) to communicate. Recently, passive RFID tag-to-tag (T2T) communication has been demonstrated, via which tags can directly communicate with each other and share information. This opens the possibility of building a Network of Tags (NeTa), in which the passive tags communicate among themselves to perform data processing functions. Among possible applications of NeTa are monitoring services in hard-to-reach locations. As an essential step toward implementation of NeTa, we consider a novel multi-hop network architecture; in particular, with the proposed novel turbo backscattering operation, inter-tag distances can be significantly increased. Due to the interference among tags’ transmissions, one of the main technical challenges of implementing such a NeTa architecture is the routing protocol design. In this paper, we introduce a design of a routing protocol, which is based on a solution of a non-linear binary optimization problem. We study the performance of the proposed protocol and investigate the impact of several network factors, such as the tag density and the transmit power of the reader.
  3. We consider the problem of controlling a set of dynamically decoupled plants where the plants' subcontrollers communicate with each other according to a fixed and known network topology. We assume the communication to be instantaneous but there is a fixed processing delay associated with incoming transmissions. We provide explicit closed-form expressions for the optimal decentralized controller under these communication constraints and using standard LQG assumptions for the plants and cost function. Although this problem is convex, it is challenging due to the irrationality of continuous-time delays and the decentralized information-sharing pattern. We show that the optimal subcontrollers each have an observer-regulator architecture containing LTI and FIR blocks and we characterize the signals that subcontrollers should transmit to each other across the network. 
  4. Opening up data produced by the Internet of Things (IoT) and mobile devices for public utilization can maximize their economic value. Challenges remain in the trustworthiness of the data sources and the security of the trading process, particularly when there is no trust between the data providers and consumers. In this paper, we propose DEXO, a decentralized data exchange mechanism that facilitates secure and fair data exchange between data consumers and distributed IoT/mobile data providers at scale, allowing the consumer to verify the data generation process and the providers to be compensated for providing authentic data, with correctness guarantees from the exchange platform. To realize this, DEXO extends the decentralized oracle network model that has been successful in the blockchain applications domain to incorporate novel hardware-cryptographic co-design that harmonizes trusted execution environment, secret sharing, and smart contract-assisted fair exchange. For the first time, DEXO ensures end-to-end data confidentiality, source verifiability, and fairness of the exchange process with strong resilience against participant collusion. We implemented a prototype of the DEXO system to demonstrate feasibility. The evaluation shows a moderate deployment cost and significantly improved blockchain operation efficiency compared to a popular data exchange mechanism. 
  5. We consider the problem of spectrum sharing by multiple cellular operators. We propose a novel deep Reinforcement Learning (DRL)-based distributed power allocation scheme which utilizes the multi-agent Deep Deterministic Policy Gradient (MA-DDPG) algorithm. In particular, we model the base stations (BSs) that belong to the multiple operators sharing the same band as DRL agents that simultaneously determine the transmit powers to their scheduled user equipment (UE) in a synchronized manner. The power decision of each BS is based on its own observation of the radio frequency (RF) environment, which consists of interference measurements reported from the UEs it serves, and a limited amount of information obtained from other BSs. One advantage of the proposed scheme is that it addresses the single-agent non-stationarity problem of RL in the multi-agent scenario by incorporating the actions and observations of other BSs into each BS's own critic, which helps it to gain a more accurate perception of the overall RF environment. A centralized-training-distributed-execution framework is used to train the policies, where the critics are trained over the joint actions and observations of all BSs while the actor of each BS only takes the local observation as input in order to produce the transmit power. Simulation with the 6 GHz Unlicensed National Information Infrastructure (U-NII)-5 band shows that the proposed power allocation scheme can achieve better throughput performance than several state-of-the-art approaches.