Title: On Enhancing WiGig Communications With A UAV-Mounted RIS System: A Contextual Multi-Armed Bandit Approach
Emerging WiGig systems suffer from limited coverage and signal-strength fluctuations due to their strict line-of-sight (LoS) connectivity requirements. In this paper, we address these shortcomings of WiGig communication by exploiting two emerging technologies in tandem, namely reconfigurable intelligent surfaces (RISs) and unmanned aerial vehicles (UAVs). In ultra-dense traffic sites (referred to as hotspots), where WiGig nodes or User Devices (UDs) experience a complex, non-line-of-sight (non-LoS) propagation environment, we envision the deployment of a UAV-mounted RIS system that complements the WiGig base station (WGBS) in delivering services to the UDs. However, commercially available UAVs have limited energy (i.e., constrained flight time). Therefore, the UAV's trajectory must be estimated locally so that it can serve multiple hotspots while minimizing its energy consumption within the WGBS coverage boundaries. Since this tradeoff problem is computationally expensive for the resource-constrained UAV, we argue that sequential learning is a lightweight yet effective way to solve the problem locally with little impact on the UAV's available energy. We formally formulate the problem as a contextual multi-armed bandit (CMAB) game and develop the linear randomized upper confidence bound (Lin-RUCB) algorithm to solve it effectively. We regard the UAV as the bandit learner, which attempts to maximize its attainable rate (i.e., the reward) by serving distinct hotspots along its trajectory, which we treat as the arms of the bandit. The context comprises the hotspots' locations, obtained via the global positioning system (GPS), and the reward history of each hotspot. Our proposal accounts for the UAV's energy expenditure in moving from one hotspot to another within its battery charge lifetime. We evaluate the performance of our proposal via extensive simulations, which demonstrate its superiority.
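As a concrete illustration of the CMAB formulation above, the following minimal sketch implements a LinUCB-style selector in which each hotspot is an arm and the context vector combines the hotspot's GPS coordinates with a summary of its reward history. The class name, the uniform randomization of the exploration bonus, and the energy-penalized reward are illustrative assumptions; this is not the paper's Lin-RUCB implementation.

```python
import numpy as np

class LinUCBHotspotSelector:
    """Illustrative LinUCB-style contextual bandit: one linear reward
    model per hotspot (arm). Names and details are assumptions, not the
    paper's Lin-RUCB."""

    def __init__(self, n_hotspots, context_dim, alpha=1.0, seed=0):
        self.alpha = alpha
        self.rng = np.random.default_rng(seed)
        # Per-arm ridge-regression statistics: A = I + sum x x^T, b = sum r x
        self.A = [np.eye(context_dim) for _ in range(n_hotspots)]
        self.b = [np.zeros(context_dim) for _ in range(n_hotspots)]

    def select(self, contexts):
        """contexts: one vector per hotspot, e.g. GPS coordinates plus a
        summary of that hotspot's past rewards."""
        scores = []
        for a, x in enumerate(contexts):
            x = np.asarray(x, dtype=float)
            A_inv = np.linalg.inv(self.A[a])
            theta = A_inv @ self.b[a]            # estimated reward model
            width = np.sqrt(x @ A_inv @ x)       # confidence width
            # Randomized exploration bonus: a stand-in for the "randomized
            # UCB" idea, not the paper's exact rule.
            bonus = self.alpha * self.rng.uniform(0.5, 1.0) * width
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """reward could be the attainable rate at the served hotspot minus
        a penalty for the flight energy spent reaching it (assumption)."""
        x = np.asarray(x, dtype=float)
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

A single linear model shared across arms (standard LinUCB with arm-specific features) would also work; the per-arm variant above simply keeps the sketch short.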
Award ID(s):
2210252
PAR ID:
10516301
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
ISBN:
978-1-6654-6483-3
Page Range / eLocation ID:
1 to 7
Subject(s) / Keyword(s):
Performance evaluation; Games; Benchmark testing; Autonomous aerial vehicles; Real-time systems; Trajectory; Proposals; WiGig; RIS; UAV; MAB; Lin-RUCB
Format(s):
Medium: X
Location:
Toronto, ON, Canada
Sponsoring Org:
National Science Foundation
More Like this
  1. Unmanned aerial vehicle (UAV) technology is a rapidly growing field with tremendous opportunities for research and applications. To achieve true autonomy for UAVs in the absence of remote control and of external navigation aids such as global navigation satellite systems and radar systems, minimum-energy trajectory planning that accounts for obstacle avoidance and stability control is key. Although this can be formulated as a constrained optimization problem, the complicated nonlinear relationships between the UAV trajectory and thrust control make it nearly impossible to solve analytically. While deep reinforcement learning (DRL) is known for its ability to provide model-free optimization of complex systems through learning, its state space, actions, and reward functions must be designed carefully. This paper presents our vision of the different layers of autonomy in a UAV system and our effort in both generating and tracking the trajectory using DRL. The experimental results show that, compared to conventional approaches, the learned trajectory needs 20% less control thrust and 18% less time to reach the target. Furthermore, using the control policy learned by DRL, the UAV achieves 58.14% less position error and 21.77% less system power.
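As a rough illustration of the reward design mentioned above, the sketch below shows one way such a DRL controller's reward could penalize position error, control thrust, and elapsed time, which are the metrics the abstract reports. The function name and weights are assumptions, not taken from the paper.

```python
import numpy as np

def trajectory_reward(position, target, thrust, dt,
                      w_pos=1.0, w_thrust=0.01, w_time=0.1):
    """Illustrative DRL reward for UAV trajectory generation/tracking.

    Penalizes distance to the target (position error), control thrust
    (energy), and elapsed time, mirroring the metrics reported in the
    abstract. Weights are placeholders, not the paper's values.
    """
    pos_error = np.linalg.norm(np.asarray(target, dtype=float)
                               - np.asarray(position, dtype=float))
    thrust_cost = float(np.sum(np.abs(thrust)))
    return -(w_pos * pos_error + w_thrust * thrust_cost + w_time * dt)
```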
  2. In urban environments, tall buildings or structures can limit the direct channel link between a base station (BS) and an Internet-of-Things device (IoTD) for wireless communication. Unmanned aerial vehicles (UAVs) with a mounted reconfigurable intelligent surface (RIS), denoted as UAV-RIS, have been introduced in recent works to enhance the system throughput capacity by acting as a relay node between the BS and the IoTDs in wireless access networks. Uncoordinated UAVs or RIS phase-shift elements make unnecessary adjustments that can significantly impact signal transmission to the IoTDs in the area. The concept of age of information (AoI) has been proposed in wireless network research to characterize the freshness of received update messages. To minimize the average sum of AoI (ASoA) in the network, two model-free deep reinforcement learning (DRL) approaches, Off-Policy Deep Q-Network (DQN) and On-Policy Proximal Policy Optimization (PPO), are developed to solve the problem by jointly optimizing the RIS phase shifts, the location of the UAV-RIS, and the IoTD transmission scheduling for large-scale IoT wireless networks. Analysis of the loss functions and extensive simulations are performed to compare the stability and convergence performance of the two algorithms. The results reveal the superiority of the On-Policy approach, PPO, over the Off-Policy approach, DQN, in terms of stability and convergence speed under diverse environment settings.
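To make the AoI objective concrete, the sketch below shows a common way the per-device age evolves and how the average sum of AoI (ASoA) would be computed. It assumes a delivered update resets a device's age to zero and does not reproduce the paper's exact model.

```python
import numpy as np

def step_aoi(aoi, delivered, dt=1.0):
    """Advance the age of information for each IoT device by one slot.

    aoi:       array of current ages, one per IoTD
    delivered: boolean array; True where a fresh update reached the BS via
               the UAV-RIS link this slot, which resets that device's age
               (instantaneous delivery is an assumption of this sketch).
    """
    return np.where(delivered, 0.0, np.asarray(aoi, dtype=float) + dt)

def average_sum_aoi(aoi_history):
    """ASoA: time average of the per-slot sum of ages (the objective the
    DQN and PPO agents are trained to minimize)."""
    return float(np.mean([np.sum(a) for a in aoi_history]))
```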
  3. Reconfigurable intelligent surfaces (RISs) have been proposed to increase coverage in millimeter-wave networks by providing an indirect path from transmitter to receiver when the line-of-sight (LoS) path is blocked. In this paper, the problem of optimizing the locations and orientations of multiple RISs is considered for the first time. An iterative coverage expansion algorithm based on gradient descent is proposed for indoor scenarios where obstacles are present. The goal of this algorithm is to maximize coverage within the shadowed regions where there is no LoS path to the access point. The algorithm is guaranteed to converge to a local coverage maximum and is combined with an intelligent initialization procedure to improve the performance and efficiency of the approach. Numerical results demonstrate that, in dense obstacle environments, the proposed algorithm doubles coverage compared to a solution without RISs and provides about a 10% coverage increase compared to a brute force sequential RIS placement approach. 
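The iterative structure described above can be pictured with the generic sketch below, which performs gradient ascent on a user-supplied coverage function of a single RIS position using numerical gradients (equivalently, gradient descent on its negative). The coverage model, step size, and stopping rule are placeholders, not the paper's algorithm or its initialization procedure.

```python
import numpy as np

def expand_coverage(ris_xy, coverage_fn, lr=0.5, iters=200, eps=1e-2):
    """Generic iterative update for one RIS position (illustrative only).

    coverage_fn(xy) should return the fraction of shadowed points that
    gain an indirect path when the RIS sits at xy; its gradient is
    estimated numerically here, standing in for the paper's update.
    """
    xy = np.asarray(ris_xy, dtype=float)
    for _ in range(iters):
        grad = np.zeros_like(xy)
        for d in range(xy.size):
            step = np.zeros_like(xy)
            step[d] = eps
            grad[d] = (coverage_fn(xy + step) - coverage_fn(xy - step)) / (2 * eps)
        if np.linalg.norm(grad) < 1e-6:      # local coverage maximum
            break
        xy = xy + lr * grad                  # ascend coverage
    return xy
```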
  4. We study downlink transmission in a multi-band heterogeneous network comprising unmanned aerial vehicle (UAV) small base stations and ground-based dual-mode mmWave small cells within the coverage area of a microwave (μW) macro base station. We formulate a two-layer optimization framework to simultaneously find an efficient coverage radius for each UAV and energy-efficient radio resource management for the network, subject to minimum quality-of-service (QoS) and maximum transmission power constraints. The outer layer derives an optimal coverage radius/height for each UAV as a function of the maximum allowed path loss. The inner layer formulates an optimization problem to maximize the system energy efficiency (EE), defined as the ratio between the aggregate user data rate delivered by the system and its aggregate energy consumption (downlink transmission and circuit power). We demonstrate that, at certain values of the target SINR τ, introducing the UAV base stations doubles the EE. We also show that increasing τ beyond an optimal EE point decreases the EE.
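The energy-efficiency definition used above (aggregate delivered rate divided by aggregate downlink transmission and circuit power) reduces to a simple ratio; the sketch below computes it with illustrative numbers that are not taken from the paper.

```python
def energy_efficiency(user_rates_bps, tx_power_w, circuit_power_w):
    """System EE in bits/Joule: aggregate delivered rate divided by the
    aggregate power (downlink transmission plus circuit power)."""
    return sum(user_rates_bps) / (tx_power_w + circuit_power_w)

# Illustrative numbers only (not from the paper): three users at 50 Mb/s
# each, 10 W transmit power, 5 W circuit power -> 1.0e7 bits/Joule.
print(energy_efficiency([50e6, 50e6, 50e6], 10.0, 5.0))
```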
  5. A framework for autonomous waypoint planning, trajectory generation through waypoints, and trajectory tracking for multi-rotor unmanned aerial vehicles (UAVs) is proposed in this work. Safe and effective operation of these UAVs is a problem that demands obstacle-avoidance strategies and advanced trajectory planning and control schemes for stability and energy efficiency. To address this problem, a two-level optimization strategy is used for trajectory generation, and the trajectory is then tracked in a stable manner. The framework consists of the following components: (a) a deep reinforcement learning (DRL)-based algorithm for optimal waypoint planning that minimizes control energy and avoids obstacles in a given environment; (b) an optimal, smooth trajectory-generation algorithm through the waypoints that minimizes a combination of velocity, acceleration, jerk, and snap; and (c) a stable tracking control law that determines the control thrust force for a UAV to track the generated trajectory.
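As a sketch of the smoothness objective in component (b), the function below sums weighted squared norms of velocity, acceleration, jerk, and snap obtained by finite differences over a sampled trajectory. The weighting and discretization are assumptions; the paper's generation algorithm itself is not reproduced here.

```python
import numpy as np

def smoothness_cost(positions, dt, weights=(1.0, 1.0, 1.0, 1.0)):
    """Illustrative smoothness cost over a sampled trajectory.

    positions: array of shape (T, 3) sampled every dt seconds.
    weights:   placeholder weights on velocity, acceleration, jerk, and
               snap (the combination the abstract describes).
    """
    deriv = np.asarray(positions, dtype=float)
    cost = 0.0
    for w in weights:
        deriv = np.diff(deriv, axis=0) / dt   # next time derivative
        cost += w * float(np.sum(deriv ** 2)) * dt
    return cost
```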