Multi-band operation in wireless networks can improve data rates by leveraging the benefits of propagation in different frequency ranges. Distinctive beam management procedures in different bands complicate band assignment because they require considering not only the channel quality but also the associated beam management overhead. Reinforcement learning (RL) is a promising approach for multi-band operation as it enables the system to learn and adjust its behavior through environmental feedback. In this paper, we formulate a sequential decision problem to jointly perform band assignment and beam management. We propose a method based on hierarchical RL (HRL) to handle the complexity of the problem by separating the policies for band selection and beam management. We evaluate the proposed HRL-based algorithm on a realistic channel generated based on ray-tracing simulators. Our results show that the proposed approach outperforms traditional RL approaches in terms of reduced beam training overhead and increased data rates under a realistic vehicular channel.
Joint Relay Selection and Beam Management Based on Deep Reinforcement Learning for Millimeter Wave Vehicular Communication
Cooperative relays improve reliability and coverage in wireless networks by providing multiple paths for data transmission. Relaying will play an essential role in vehicular networks at higher frequency bands, where mobility and frequent signal blockages cause link outages. To ensure connectivity in a relay-aided vehicular network, the relay selection policy should be designed to efficiently find unblocked relays. Inspired by recent advances in beam management in mobile millimeter wave (mmWave) networks, this paper addresses the question: how can the best relay be selected with minimal overhead from beam management? To this end, we formulate a sequential decision problem to jointly optimize relay selection and beam management. We propose a joint relay selection and beam management policy based on deep reinforcement learning (DRL) using the Markov property of beam indices and beam measurements. The proposed DRL-based algorithm learns time-varying thresholds that adapt to the dynamic channel conditions and traffic patterns. Numerical experiments demonstrate that the proposed algorithm outperforms baselines without prior channel knowledge. Moreover, the DRL-based algorithm can maintain high spectral efficiency under fast-varying channels.
- Award ID(s): 2153698
- PAR ID: 10496957
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE Transactions on Vehicular Technology
- Volume: 72
- Issue: 10
- ISSN: 0018-9545
- Page Range / eLocation ID: 13067 to 13080
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Increasing the data rate in wireless networks (e.g., vehicular ones) can be accomplished through a two-pronged approach: 1) increasing the network flow rate through parallel independent routes and 2) increasing the user's link rate through beamforming codebook adaptation. Mobile relays (e.g., mobile roadside units) are used to achieve these goals given their flexible positioning. First, at the network level, we model regularized Laplacian matrices, which are symmetric positive definite (SPD) matrices representing relay-dependent network graphs, as points on Riemannian manifolds. Inspired by the geometric classification of different tasks in the brain network, Riemannian metrics, such as the Log-Euclidean metric (LEM), are used to choose relay positions that result in maximum LEM. Simulation results show that the proposed LEM-based relay positioning algorithm enables parallel routes and achieves maximum network flow rate, as opposed to other conventional metrics (e.g., algebraic connectivity). Second, at the link level, we propose an unsupervised geometric machine learning (G-ML) approach to learn the unique channel characteristics of each relay-dependent environment. Given that spatially correlated fading channels have SPD covariance matrices, they can be represented on Riemannian manifolds. Consequently, the LEM-based Riemannian metric is used for unsupervised learning of the environment channels, and a matched beamforming codebook is constructed accordingly. Simulation results show that the proposed G-ML model increases the link rate after a short training period.
- In this paper, we consider the problem of constructing paths using decode and forward (DF) relays for millimeter wave (mmWave) backhaul communications in urban environments. Due to the large number of obstacles in urban environments, line-of-sight (LoS) wireless links, which are necessary for backhaul communication, often do not exist between small-cell base stations. To address this, some earlier works proposed creating multi-hop paths that use mmWave relay nodes with LoS communication between every pair of consecutive nodes to form logical links between base stations. We present algorithms, based on a novel widest-path formulation of the problem, for selecting decode and forward relay node locations in such paths. Our main algorithm is the first polynomial-time algorithm that constructs a relay path with a throughput that is proven to be the maximum possible. We also present variations of this algorithm for constrained problems in which: 1) each possible relay location can host only one relay node, and 2) minimizing the number of hops in the relay path is also an objective. For all of the proposed algorithms, the achievable throughput and numbers of relays are evaluated through simulation based on a 3-D model of a section of downtown Atlanta. The results show that, over a large number of random cases, our algorithm can always find paths with very high throughput using a small number of relays. We also compare and contrast the results with our earlier work that studied the use of amplify-and-forward (AF) relays for the same scenario.
- Impairments such as time-varying phase noise (PHN) and carrier frequency offset (CFO) result in loss of synchronization and poor performance of multi-relay communication systems. Joint estimation of these impairments is necessary in order to correctly decode the received signal at the destination. In this paper, we address spectrally efficient multi-relay transmission scenarios where all the relays simultaneously communicate with the destination. We propose an iterative pilot-aided algorithm based on the expectation conditional maximization for joint estimation of multipath channels, Wiener PHNs, and CFOs in decode-and-forward-based multi-relay orthogonal frequency division multiplexing systems. Next, a new expression of the hybrid Cramér-Rao lower bound (HCRB) for the multi-parameter estimation problem is derived. Finally, an iterative receiver based on an extended Kalman filter for joint data detection and PHN tracking is employed. Numerical results show that the proposed estimator outperforms existing algorithms and its mean square error performance is close to the derived HCRB at different signal-to-noise ratios for different PHN variances. In addition, the combined estimation algorithm and the iterative receiver can significantly improve average bit-error rate (BER) performance compared with existing algorithms. Moreover, the BER performance of the proposed system is close to the ideal case of perfect estimation of channel impulse responses, PHNs, and CFOs.
- This paper studies MIMO relays with non-identical link coherence times, a frequently occurring condition when, e.g., the nodes in the relay channel do not all have the same mobility, or the scatterers around some nodes have different mobility compared with those around other nodes. Despite its practical relevance, this condition, known as coherence diversity, has not been studied in the relay channel. This paper studies the performance of MIMO relays and proposes efficient transmission strategies under coherence diversity. Since coherence times have a prominent impact on channel training, we do not assume channel state is available to the decoder for free; all channel training resources are accounted for in the calculations. A product superposition technique is employed at the source, which allows a more efficient usage of degrees of freedom when the relay and the destination have different training requirements. Varying configurations of coherence times are studied, including the interesting case where the link coherence intervals are not multiples of each other and therefore do not align. Relay scheduling is combined with the product superposition to obtain further gains in degrees of freedom. The impact of coherence diversity is further studied in the presence of multiple parallel relays.
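For readers unfamiliar with the Log-Euclidean metric (LEM) named in the first related abstract above, its standard textbook definition for two SPD matrices is given below. Whether that paper uses exactly this form, or a scaled or regularized variant, is an assumption here; the symbols A, B, U, and \lambda_i are introduced only for illustration.

```latex
% Standard Log-Euclidean distance between SPD matrices A and B (illustrative notation).
% Eigendecomposition of an SPD matrix: A = U \operatorname{diag}(\lambda_1,\dots,\lambda_n) U^{\mathsf{T}}, \ \lambda_i > 0.
\[
  \log A = U \operatorname{diag}(\log\lambda_1,\dots,\log\lambda_n)\, U^{\mathsf{T}},
  \qquad
  d_{\mathrm{LE}}(A,B) = \bigl\lVert \log A - \log B \bigr\rVert_{F}.
\]
```

Because the matrix logarithm maps SPD matrices to symmetric matrices, this distance reduces to an ordinary Frobenius norm in the log domain, which keeps it cheap to evaluate.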
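The "widest-path" idea mentioned in the decode-and-forward backhaul abstract above can be illustrated with the generic maximum-bottleneck path problem: among all paths between two nodes, pick the one whose weakest link is as strong as possible. The sketch below is a textbook Dijkstra-style variant in Python, not the algorithm from that paper; the function name `widest_path`, the graph layout, and the node names and link rates are all hypothetical, and the paper's throughput model for DF relays may differ from a plain per-link bottleneck.

```python
import heapq


def widest_path(graph, source, target):
    """Return (bottleneck, path) maximizing the minimum link rate on the path.

    graph: dict mapping node -> {neighbor: link_rate}; rates are assumed positive.
    Returns (0.0, []) when the target cannot be reached.
    """
    best = {source: float("inf")}   # best known bottleneck to each node
    parent = {source: None}
    heap = [(-float("inf"), source)]  # max-heap emulated with negated widths

    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if width < best.get(node, 0.0):
            continue  # stale heap entry, a wider route was already found
        if node == target:
            break  # first pop of the target carries its final bottleneck
        for neighbor, rate in graph.get(node, {}).items():
            candidate = min(width, rate)  # bottleneck if we extend via this link
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (-candidate, neighbor))

    if target not in best:
        return 0.0, []

    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[target], path[::-1]


# Hypothetical topology: two base stations and two candidate relay sites.
example_graph = {
    "bs_a": {"r1": 4.0, "r2": 9.0},
    "r1": {"bs_b": 8.0},
    "r2": {"bs_b": 5.0},
    "bs_b": {},
}
bottleneck, route = widest_path(example_graph, "bs_a", "bs_b")
```

In this toy topology the route through `r2` is preferred because its weakest link (5.0) is stronger than the weakest link through `r1` (4.0), even though `r1` has the stronger second hop.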

