This content will become publicly available on October 1, 2024

Title: Joint Relay Selection and Beam Management Based on Deep Reinforcement Learning for Millimeter Wave Vehicular Communication
Cooperative relays improve reliability and coverage in wireless networks by providing multiple paths for data transmission. Relaying will play an essential role in vehicular networks at higher frequency bands, where mobility and frequent signal blockages cause link outages. To ensure connectivity in a relay-aided vehicular network, the relay selection policy should be designed to efficiently find unblocked relays. Inspired by recent advances in beam management in mobile millimeter wave (mmWave) networks, this paper addresses the question: how can the best relay be selected with minimal overhead from beam management? To this end, we formulate a sequential decision problem to jointly optimize relay selection and beam management. We propose a joint relay selection and beam management policy based on deep reinforcement learning (DRL) using the Markov property of beam indices and beam measurements. The proposed DRL-based algorithm learns time-varying thresholds that adapt to the dynamic channel conditions and traffic patterns. Numerical experiments demonstrate that the proposed algorithm outperforms baselines without prior channel knowledge. Moreover, the DRL-based algorithm can maintain high spectral efficiency under fast-varying channels.
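As a rough illustration of the sequential decision described in the abstract, the following Python sketch probes candidate relays one at a time and stops as soon as a measurement clears a step-dependent threshold. It is not the paper's DRL algorithm: the function name `select_relay`, the SNR values, and the hand-picked thresholds are all assumptions made for illustration, whereas the paper learns its thresholds through DRL.

```python
from typing import Sequence, Tuple


def select_relay(
    measured_snr_db: Sequence[float],
    thresholds_db: Sequence[float],
) -> Tuple[int, int]:
    """Probe relays in order and stop at the first one that clears its threshold.

    Returns (selected relay index, number of relays probed). The number of
    probes stands in for beam-management overhead in this toy model.
    Hypothetical helper for illustration only.
    """
    if len(measured_snr_db) == 0:
        raise ValueError("at least one relay is required")
    if len(thresholds_db) != len(measured_snr_db):
        raise ValueError("one threshold per probing step is required")

    best_idx = 0
    for step, (snr, threshold) in enumerate(zip(measured_snr_db, thresholds_db)):
        # Track the best relay seen so far as a fallback.
        if snr > measured_snr_db[best_idx]:
            best_idx = step
        # Accept early if this relay is good enough for the current step.
        if snr >= threshold:
            return step, step + 1

    # No relay cleared its threshold: fall back to the best one probed.
    return best_idx, len(measured_snr_db)


if __name__ == "__main__":
    # Assumed example values: thresholds relax as more probes are spent.
    snr = [3.0, 12.0, 20.0]
    thresholds = [15.0, 10.0, 5.0]
    print(select_relay(snr, thresholds))
```

With these assumed values the second relay is accepted after two probes, even though the third relay is stronger. That trade-off between link quality and probing overhead is what a threshold policy controls.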
Award ID(s):
2153698
NSF-PAR ID:
10496957
Publisher / Repository:
IEEE
Journal Name:
IEEE Transactions on Vehicular Technology
Volume:
72
Issue:
10
ISSN:
0018-9545
Page Range / eLocation ID:
13067 to 13080
Sponsoring Org:
National Science Foundation
More Like this
  1. Increasing the data rate in wireless networks (e.g., vehicular ones) can be accomplished through a two-pronged approach: 1) increasing the network flow rate through parallel independent routes and 2) increasing the user's link rate through beamforming codebook adaptation. Mobile relays (e.g., mobile roadside units) are utilized to achieve these goals given their flexible positioning. First, at the network level, we model regularized Laplacian matrices, which are symmetric positive definite (SPD) matrices representing relay-dependent network graphs, as points over Riemannian manifolds. Inspired by the geometric classification of different tasks in the brain network, Riemannian metrics, such as the Log-Euclidean metric (LEM), are utilized to choose relay positions that result in maximum LEM. Simulation results show that the proposed LEM-based relay positioning algorithm enables parallel routes and achieves maximum network flow rate, as opposed to other conventional metrics (e.g., algebraic connectivity). Second, at the link level, we propose an unsupervised geometric machine learning (G-ML) approach to learn the unique channel characteristics of each relay-dependent environment. Given that spatially correlated fading channels have SPD covariance matrices, they can be represented over Riemannian manifolds. Consequently, the LEM-based Riemannian metric is utilized for unsupervised learning of the environment channels, and a matched beamforming codebook is constructed accordingly. Simulation results show that the proposed G-ML model increases the link rate after a short training period.
  2. Abstract: Impairments such as time-varying phase noise (PHN) and carrier frequency offset (CFO) result in loss of synchronization and poor performance of multi-relay communication systems. Joint estimation of these impairments is necessary in order to correctly decode the received signal at the destination. In this paper, we address spectrally efficient multi-relay transmission scenarios where all the relays simultaneously communicate with the destination. We propose an iterative pilot-aided algorithm based on expectation conditional maximization for joint estimation of multipath channels, Wiener PHNs, and CFOs in decode-and-forward-based multi-relay orthogonal frequency division multiplexing systems. Next, a new expression of the hybrid Cramér-Rao lower bound (HCRB) for the multi-parameter estimation problem is derived. Finally, an iterative receiver based on an extended Kalman filter for joint data detection and PHN tracking is employed. Numerical results show that the proposed estimator outperforms existing algorithms and that its mean square error performance is close to the derived HCRB at different signal-to-noise ratios for different PHN variances. In addition, the combined estimation algorithm and iterative receiver can significantly improve average bit-error rate (BER) performance compared with existing algorithms. Moreover, the BER performance of the proposed system is close to the ideal case of perfect estimation of channel impulse responses, PHNs, and CFOs.
  3. Abstract: The radio access network (RAN) in 5G is expected to satisfy the stringent delay requirements of a variety of applications. The packet scheduler plays an important role by allocating spectrum resources to user equipments (UEs) at each transmit time interval (TTI). In this paper, we show that optimal scheduling is a challenging combinatorial optimization problem, which is hard to solve within the channel coherence time with conventional optimization methods. Rule-based scheduling methods, on the other hand, are hard to adapt to the time-varying wireless channel conditions and various data request patterns of UEs. Recently, integrating artificial intelligence (AI) into wireless networks has drawn great interest from both academia and industry. In this paper, we incorporate deep reinforcement learning (DRL) into the design of cellular packet scheduling. A delay-aware cell traffic scheduling algorithm is developed to map the observed system state to a scheduling decision. Due to the huge state space, a recurrent neural network (RNN) is utilized to approximate the optimal action-policy function. Different from conventional rule-based scheduling methods, the proposed scheme can learn from interactions with the environment and adaptively choose the best scheduling decision at each TTI. Simulation results show that the DRL-based packet scheduling can achieve the lowest average delay compared with several conventional approaches. Meanwhile, the UEs' average queue lengths can also be significantly reduced. The developed method also exhibits great potential for real-time scheduling in delay-sensitive scenarios.
  4. In urban environments, tall buildings or structures can limit the direct channel link between a base station (BS) and an Internet-of-Things device (IoTD) for wireless communication. Unmanned aerial vehicles (UAVs) with a mounted reconfigurable intelligent surface (RIS), denoted as UAV-RIS, have been introduced in recent works to enhance the system throughput capacity by acting as a relay node between the BS and the IoTDs in wireless access networks. Uncoordinated UAVs or RIS phase shift elements will make unnecessary adjustments that can significantly impact the signal transmission to IoTDs in the area. The concept of age of information (AoI) is proposed in wireless network research to quantify the freshness of the received update message. To minimize the average sum of AoI (ASoA) in the network, two model-free deep reinforcement learning (DRL) approaches, Off-Policy Deep Q-Network (DQN) and On-Policy Proximal Policy Optimization (PPO), are developed to solve the problem by jointly optimizing the RIS phase shift, the location of the UAV-RIS, and the IoTD transmission scheduling for large-scale IoT wireless networks. Analysis of loss functions and extensive simulations are performed to compare the stability and convergence performance of the two algorithms. The results reveal the superiority of the On-Policy approach, PPO, over the Off-Policy approach, DQN, in terms of stability and convergence speed under diverse environment settings.
  5. This paper proposes a novel cognitive cooperative transmission scheme by exploiting massive multiple-input multiple-output (MMIMO) and non-orthogonal multiple access (NOMA) radio technologies, which enables a macrocell network and multiple cognitive small cells to cooperate in dynamic spectrum sharing. The macrocell network is assumed to own the spectrum band and be the primary network (PN), and the small cells act as the secondary networks (SNs). The secondary access points (SAPs) of the small cells can cooperatively relay the traffic for the primary users (PUs) in the macrocell network, while concurrently accessing the PUs’ spectrum to transmit their own data opportunistically through MMIMO and NOMA. Such cooperation creates a “win-win” situation: the throughput of PUs will be significantly increased with the help of SAP relays, and the SAPs are able to use the PUs’ spectrum to serve their secondary users (SUs). The interplay of these advanced radio techniques is analyzed in a systematic manner, and a framework is proposed for the joint optimization of cooperative relay selection, NOMA and MMIMO transmit power allocation, and transmission scheduling. Further, to model network-wide cooperation and competition, a two-sided matching algorithm is designed to find a stable partnership between multiple SAPs and PUs. The evaluation results demonstrate that the proposed scheme achieves significant performance gains for both primary and secondary users compared to the baselines.