Accelerating Model Free Reinforcement Learning with Imperfect Model Knowledge in Dynamic Spectrum Access

Li, Lianjun; Liu, Lingjia; Bai, Jianan; Chang, Hao-Hsuan; Chen, Hao; Ashdown, Jonathan D.; Zhang, Jianzhong; Yi, Yang

doi:10.1109/JIOT.2020.2988268

Current studies that apply reinforcement learning (RL) to dynamic spectrum access (DSA) problems in wireless communication systems focus mainly on model-free RL. In practice, however, model-free RL requires a large number of samples to achieve good performance, making it impractical for real-time applications such as DSA. Combining model-free and model-based RL can potentially reduce the sample complexity while achieving a level of performance similar to that of model-free RL, as long as the learned model is accurate enough. In complex environments, however, the learned model is never perfect. In this paper, we combine model-free and model-based RL and introduce an algorithm that can work with an imperfectly learned model to accelerate model-free RL. Results show that our algorithm achieves higher sample efficiency than both a standard model-free RL algorithm and the Dyna algorithm (a standard algorithm that integrates model-based and model-free RL), with much lower computational complexity than the Dyna algorithm. In the extreme case where the learned model is highly inaccurate, the Dyna algorithm performs even worse than the model-free RL algorithm, while our algorithm still outperforms the model-free RL algorithm.
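For readers unfamiliar with the Dyna baseline named in the abstract, the following is a minimal sketch of tabular Dyna-Q, one common instance of the Dyna idea: each real transition drives a model-free Q-learning update, is stored in a learned model, and is then replayed in a few extra planning updates. It is not the algorithm proposed in the paper. The function names (`dyna_q`, `chain_step`), the toy chain environment, and all hyperparameter values are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def dyna_q(step, n_actions, episodes=50, planning_steps=5,
           alpha=0.1, gamma=0.95, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Dyna-Q sketch.

    `step(state, action)` is a hypothetical environment function that
    returns (next_state, reward, done). Every episode starts in state 0.
    """
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> value estimate
    model = {}              # (state, action) -> (next_state, reward, done)

    def greedy(s):
        # Greedy action with random tie-breaking.
        best = max(q[(s, a)] for a in range(n_actions))
        return rng.choice([a for a in range(n_actions) if q[(s, a)] == best])

    def update(s, a, s2, r, done):
        # One-step Q-learning update toward the bootstrapped target.
        future = 0.0 if done else gamma * max(q[(s2, b)] for b in range(n_actions))
        q[(s, a)] += alpha * (r + future - q[(s, a)])

    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.randrange(n_actions) if rng.random() < epsilon else greedy(s)
            s2, r, done = step(s, a)
            update(s, a, s2, r, done)      # model-free update from a real sample
            model[(s, a)] = (s2, r, done)  # learned model (assumes determinism)
            for _ in range(planning_steps):
                # Model-based updates from transitions replayed out of the model.
                (ps, pa), (ps2, pr, pdone) = rng.choice(list(model.items()))
                update(ps, pa, ps2, pr, pdone)
            s = s2
            if done:
                break
    return q, greedy


def chain_step(state, action, length=5):
    """Toy chain: action 1 moves right, action 0 moves left; reward 1 at the right end."""
    nxt = min(state + 1, length - 1) if action == 1 else max(state - 1, 0)
    done = nxt == length - 1
    return nxt, (1.0 if done else 0.0), done


q, greedy = dyna_q(chain_step, n_actions=2)
policy = [greedy(s) for s in range(4)]  # greedy action in each non-terminal state
```

In this sketch the planning loop trusts the stored model completely, so an inaccurate model would feed wrong targets into the same value table. That is the failure mode the abstract attributes to Dyna when the learned model is highly inaccurate.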

Award ID(s): 1811497; 1937487

PAR ID: 10161733

Author(s) / Creator(s): Li, Lianjun; Liu, Lingjia; Bai, Jianan; Chang, Hao-Hsuan; Chen, Hao; Ashdown, Jonathan D.; Zhang, Jianzhong; Yi, Yang

Date Published: 2020-04-16

Journal Name: IEEE Internet of Things Journal

ISSN: 2372-2541

Page Range / eLocation ID: 1 to 1

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article: https://doi.org/10.1109/JIOT.2020.2988268
