Title: Optimizing AoI in UAV-RIS Assisted IoT Networks: Off Policy vs. On Policy
In urban environments, tall buildings and other structures can obstruct the direct channel link between a base station (BS) and an Internet-of-Things device (IoTD). Unmanned aerial vehicles (UAVs) with a mounted reconfigurable intelligent surface (RIS), denoted as UAV-RIS, have been introduced in recent works to enhance system throughput by acting as a relay node between the BS and the IoTDs in wireless access networks. However, uncoordinated UAVs or RIS phase-shift elements make unnecessary adjustments that can significantly degrade signal transmission to IoTDs in the area. Age of information (AoI) is a metric used in wireless network research to quantify the freshness of received update messages. To minimize the average sum of AoI (ASoA) in the network, two model-free deep reinforcement learning (DRL) approaches, the Off-Policy Deep Q-Network (DQN) and the On-Policy Proximal Policy Optimization (PPO), are developed to jointly optimize the RIS phase shift, the location of the UAV-RIS, and the IoTD transmission scheduling for large-scale IoT wireless networks. Loss functions are analyzed and extensive simulations are performed to compare the stability and convergence performance of the two algorithms. The results reveal the superiority of the On-Policy approach, PPO, over the Off-Policy approach, DQN, in terms of stability and convergence speed across diverse environment settings.
Award ID(s):
1757207
NSF-PAR ID:
10418701
Journal Name:
IEEE Internet of Things Journal
ISSN:
2372-2541
Page Range / eLocation ID:
1 to 1
Sponsoring Org:
National Science Foundation
More Like this
  1. The unmanned aerial vehicle (UAV) is one of the technological breakthroughs that supports a variety of services, including communications. UAVs can also enhance the security of wireless networks. This paper defines the problem of eavesdropping on the link between the ground user and the UAV, which serves as an aerial base station (ABS). The reinforcement learning algorithms Q-learning and deep Q-network (DQN) are proposed for optimizing the position of the ABS and the transmission power to enhance the data rate of the ground user. This increases the secrecy capacity without the system knowing the location of the eavesdropper. Simulation results show fast convergence and the highest secrecy capacity of the proposed DQN compared to Q-learning and two baseline approaches. 
  2. In this work, we develop a two time-scale deep learning approach for beamforming and phase shift (BF-PS) design in time-varying RIS-aided networks. In contrast to most existing works that assume perfect channel state information (CSI) for BF-PS design, we take into account the cost of channel estimation and utilize Long Short-Term Memory (LSTM) networks to design BF-PS from limited samples of estimated CSI. An LSTM channel extrapolator is designed first to generate high-resolution estimates of the cascaded BS-RIS-user channel from sampled signals acquired at a slow time scale. Subsequently, the outputs of the channel extrapolator are fed into an LSTM-based two-stage neural network for the joint design of BF-PS at a fast time scale of once per coherence time. To address the critical issue that training overhead increases linearly with the number of RIS elements, we consider various pilot structures and sampling patterns in time and space to evaluate the efficiency and sum-rate performance of the proposed two time-scale design. Our results show that the proposed two time-scale design can achieve good spectral efficiency when taking into account the pilot overhead required for training. The proposed design also outperforms a direct BF-PS design that does not employ a channel extrapolator. These results demonstrate the feasibility of applying RIS in time-varying channels with reasonable pilot overhead.
  3. UAVs need to communicate in three dimensions (3D) with other aerial vehicles, both above and below them, and often need to connect to ground stations. However, wireless transmission in 3D space dissipates significant power, often limiting the range achievable on these types of links. Directional transmission is one way to efficiently use available wireless channels to achieve the desired range. While multiple-input multiple-output (MIMO) systems can digitally steer the beam through channel matrix manipulation without needing directional awareness, the power resources required for operating multiple radios on a UAV are often logistically challenging. An alternative approach to streamline resources is the use of phased arrays to achieve directionality in the analog domain, but this requires beam sweeping and results in search-time delay. The complexity and search time can increase with the dynamic mobility pattern of the UAVs in aerial networks. However, if the direction of the receiver is known at the transmitter, the search time can be significantly reduced. In this work, multi-antenna channels between two UAVs in air-to-air (A2A) links are analyzed, and based on these findings, an efficient machine learning-based method for estimating the direction of a transmitting node using channel estimates of 4 antennas (2 × 2 MIMO) is proposed. The performance of the proposed method is validated through in-field drone-to-drone measurements. Findings indicate that the proposed method can estimate the direction of the transmitter in the A2A link with 86% accuracy. Further, the proposed direction estimation method is deployable for UAV-based massive MIMO systems to select the directional beam without the need to sweep or search for optimal communication performance.
  4. Existing adversarial algorithms for Deep Reinforcement Learning (DRL) have largely focused on identifying an optimal time to attack a DRL agent. However, little work has explored injecting efficient adversarial perturbations into DRL environments. We propose a suite of novel DRL adversarial attacks, called ACADIA, representing AttaCks Against Deep reInforcement leArning. ACADIA provides a set of efficient and robust perturbation-based adversarial attacks to disturb the DRL agent's decision-making based on novel combinations of techniques utilizing momentum, ADAM optimizer (i.e., Root Mean Square Propagation, or RMSProp), and initial randomization. DRL attacks that integrate these techniques in this way have not been studied in the existing Deep Neural Network (DNN) and DRL research. We consider two well-known DRL algorithms, Deep-Q Learning Network (DQN) and Proximal Policy Optimization (PPO), under Atari games and MuJoCo, where both targeted and non-targeted attacks are considered with or without the state-of-the-art defenses in DRL (i.e., RADIAL and ATLA). Our results demonstrate that the proposed ACADIA outperforms existing gradient-based counterparts under a wide range of experimental settings. ACADIA is nine times faster than the state-of-the-art Carlini & Wagner (CW) method, with better performance under DRL defenses.
  5. To mitigate the long-term spectrum crunch problem, the FCC recently opened up the 6 GHz frequency band for unlicensed use. However, the existing spectrum sharing strategies cannot support the operation of access points in moving vehicles such as cars and UAVs. This is primarily because of the directionality-based spectrum sharing among the incumbent systems in this band and the high mobility of the moving vehicles, which together make it challenging to control the cross-system interference. In this paper we propose SwarmShare, a mobility-resilient spectrum sharing framework for swarm UAV networking in the 6 GHz band. We first present a mathematical formulation of the SwarmShare problem, where the objective is to maximize the spectral efficiency of the UAV network by jointly controlling the flight and transmission power of the UAVs and their association with the ground users, under the interference constraints of the incumbent system. We find that there are no closed-form mathematical models that can be used to characterize the statistical behavior of the aggregate interference from the UAVs to the incumbent system. We then propose a data-driven three-phase spectrum sharing approach, including Initial Power Enforcement, Offline-dataset Guided Online Power Adaptation, and Reinforcement Learning-based UAV Optimization. We validate the effectiveness of SwarmShare through an extensive simulation campaign. Results indicate that, based on SwarmShare, the aggregate interference from the UAVs to the incumbent system can be effectively controlled below the target level without requiring real-time cross-system channel state information. The mobility resilience of SwarmShare is also validated in coexisting networks with no precise UAV location information.