Title: DeepMellow: Removing the Need for a Target Network in Deep Q-Learning
Deep Q-Network (DQN) is an algorithm that achieves human-level performance in complex domains like Atari games. One of the important elements of DQN is its use of a target network, which is necessary to stabilize learning. We argue that the use of a target network is incompatible with online reinforcement learning, and that it is possible to achieve faster and more stable learning without a target network when we use Mellowmax, an alternative softmax operator. We derive novel properties of Mellowmax and empirically show that the combination of DQN and Mellowmax, without a target network, outperforms DQN with a target network.
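For context, Mellowmax is the log-average-exp of the action values scaled by a temperature parameter ω; it recovers the max as ω → ∞ and the mean as ω → 0. The sketch below, not taken from the paper, illustrates how such an operator could replace the max in the bootstrap target so that the online network's own Q-values are used for bootstrapping; the value of ω and the function names are illustrative assumptions.

```python
import numpy as np

def mellowmax(q_values, omega=5.0):
    """Mellowmax: log-average-exp of the Q-values, scaled by the temperature omega."""
    q = np.asarray(q_values, dtype=np.float64)
    c = q.max()  # shift by the max for numerical stability
    return c + np.log(np.mean(np.exp(omega * (q - c)))) / omega

def td_target(reward, next_q_values, gamma=0.99, omega=5.0, done=False):
    """Bootstrap target using mellowmax over the online network's own Q-values,
    in place of the usual max over a separate target network."""
    return reward if done else reward + gamma * mellowmax(next_q_values, omega)
```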
Award ID(s): 1717569
PAR ID: 10180061
Author(s) / Creator(s):
Date Published:
Journal Name: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Page Range / eLocation ID: 2733-2739
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Deep Reinforcement Learning (DRL) has been shown to be a very powerful technique in recent years on a wide range of applications. Much of the prior DRL work took the online learning approach. However, given the challenges of building accurate simulations for modeling student learning, we investigated applying DRL to induce a pedagogical policy through an offline approach. In this work, we explored the effectiveness of offline DRL for pedagogical policy induction in an Intelligent Tutoring System. Generally speaking, when applying offline DRL, we face two major challenges: one is limited training data and the other is the credit-assignment problem caused by delayed rewards. In this work, we used Gaussian Processes to address the credit-assignment problem by estimating inferred immediate rewards from the final delayed rewards. We then applied the DQN and Double-DQN algorithms to induce adaptive pedagogical strategies tailored to individual students. Our empirical results show that without solving the credit-assignment problem, the DQN policy, although better than Double-DQN, was no better than a random policy. However, when combining DQN with the inferred rewards, our best DQN policy outperformed the random yet reasonable baseline policy, especially for students with high pre-test scores.
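For readers unfamiliar with the DQN versus Double-DQN distinction that this abstract relies on, the sketch below shows the two bootstrap targets in generic PyTorch. It does not reproduce the paper's Gaussian-Process reward inference, and the tensor shapes (one Q-value per action along the last dimension) are assumptions.

```python
import torch

def dqn_target(reward, next_state, target_net, gamma=0.99, done=False):
    # Standard DQN: the target network both selects and evaluates the next action.
    if done:
        return reward
    return reward + gamma * target_net(next_state).max(dim=-1).values

def double_dqn_target(reward, next_state, q_net, target_net, gamma=0.99, done=False):
    # Double DQN: the online network selects the action, the target network evaluates it,
    # which reduces the overestimation bias of the plain max.
    if done:
        return reward
    best_action = q_net(next_state).argmax(dim=-1, keepdim=True)
    return reward + gamma * target_net(next_state).gather(-1, best_action).squeeze(-1)
```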
  2. The unmanned aerial vehicle (UAV) is one of the technological breakthroughs that supports a variety of services, including communications. UAVs can also enhance the security of wireless networks. This paper addresses the problem of eavesdropping on the link between a ground user and a UAV serving as an aerial base station (ABS). The reinforcement learning algorithms Q-learning and deep Q-network (DQN) are proposed for optimizing the position of the ABS and its transmission power to enhance the data rate of the ground user, thereby increasing the secrecy capacity without the system knowing the eavesdropper's location. Simulation results show that the proposed DQN converges quickly and achieves the highest secrecy capacity compared to Q-learning and two baseline approaches.
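As context for the tabular Q-learning baseline mentioned above, the sketch below shows the standard update such an agent would apply after observing a secrecy-rate reward. The discretization of ABS positions and power levels, the reward definition, and the hyperparameters are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Hypothetical discretization: the real grid of ABS positions and transmit-power
# levels, and the secrecy-rate reward, are not specified here.
N_STATES, N_ACTIONS = 100, 8
Q = np.zeros((N_STATES, N_ACTIONS))

def q_learning_step(state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One tabular Q-learning update toward the greedy bootstrap target."""
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])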
  3. Hutter F., Kersting K. (Ed.)
    A quantification learning task estimates class ratios or the class distribution of a test set. Quantification learning is useful in a variety of application domains such as commerce, public health, and politics. For instance, it is desirable to automatically estimate the proportion of customer satisfaction in different aspects from product reviews to improve customer relationships. We formulate the quantification learning problem as a maximum likelihood problem and propose the first end-to-end Deep Quantification Network (DQN) framework. DQN jointly learns quantification feature representations and directly predicts the class distribution. Compared to classification-based quantification methods, DQN avoids three separate steps: classifying individual instances, calculating the predicted class ratios, and adjusting the class ratios to account for classification errors. We evaluated DQN on four public datasets, ranging from movie and product reviews to multi-class news, compared it against six existing quantification methods, and conducted a sensitivity analysis of its performance. Compared to the best existing method in our study, (1) DQN reduces Mean Absolute Error (MAE) by about 35%, and (2) DQN uses around 40% fewer training samples to achieve a comparable MAE.
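The core idea of end-to-end quantification, mapping a bag of instances directly to a class distribution instead of classifying and counting, can be sketched as below. The encoder, pooling choice, and loss are illustrative assumptions and do not reflect the paper's actual architecture.

```python
import torch
import torch.nn as nn

class QuantificationNet(nn.Module):
    """Toy sketch: encode each instance, pool over the test bag, predict class ratios."""
    def __init__(self, in_dim, n_classes, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, bag):                               # bag: (n_instances, in_dim)
        pooled = self.encoder(bag).mean(dim=0)            # aggregate over the whole bag
        return torch.softmax(self.head(pooled), dim=-1)   # predicted class distribution

def quantification_loss(pred_ratios, true_ratios, eps=1e-8):
    """KL divergence between true and predicted class ratios (maximum-likelihood style)."""
    return torch.sum(true_ratios * (torch.log(true_ratios + eps) - torch.log(pred_ratios + eps)))
```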
  4. Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the λ-return difficult in this context. In particular, off-policy methods that utilize experience replay remain problematic because their random sampling of minibatches is not conducive to the efficient calculation of λ-returns. Yet replay-based methods are often the most sample-efficient, and incorporating λ-returns into them is a viable way to achieve new state-of-the-art performance. Toward this end, we propose the first method to enable practical use of λ-returns in arbitrary replay-based methods without relying on other forms of decorrelation such as asynchronous gradient updates. By promoting short sequences of past transitions into a small cache within the replay memory, adjacent λ-returns can be efficiently precomputed by sharing Q-values. Computation is not wasted on experiences that are never sampled, and stored λ-returns behave as stable temporal-difference (TD) targets that replace the target network. Additionally, our method grants the unique ability to observe TD errors prior to sampling; for the first time, transitions can be prioritized by their true significance rather than by a proxy to it. Furthermore, we propose the novel use of the TD error to dynamically select λ-values that facilitate faster learning. We show that these innovations can enhance the performance of DQN when playing Atari 2600 games, even under partial observability. While our work specifically focuses on λ-returns, these ideas are applicable to any multi-step return estimator.
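The backward recursion behind this kind of cached λ-return precomputation can be written compactly. The sketch below shows one common formulation that blends the one-step greedy target with the next λ-return over a short sequence whose Q-values have already been computed; the exact recursion, caching policy, and hyperparameters used in the paper may differ.

```python
import numpy as np

def lambda_returns(rewards, bootstrap_q, dones, gamma=0.99, lam=0.8):
    """Compute λ-returns backward over a short cached sequence of transitions.

    rewards[t]     : reward observed at step t
    bootstrap_q[t] : max_a Q(s_{t+1}, a) from the online network (shared across returns)
    dones[t]       : whether the episode terminated at step t
    """
    T = len(rewards)
    returns = np.zeros(T)
    next_return = bootstrap_q[-1]  # plain one-step bootstrap beyond the cached window
    for t in reversed(range(T)):
        if dones[t]:
            returns[t] = rewards[t]  # terminal step: no bootstrap
        else:
            td_mix = (1 - lam) * bootstrap_q[t] + lam * next_return
            returns[t] = rewards[t] + gamma * td_mix
        next_return = returns[t]
    return returns
```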
  5. A reconfigurable intelligent surface (RIS) is a prospective wireless technology that enhances wireless channel quality. An RIS is often equipped with a passive array of elements and provides cost- and power-efficient solutions for extending the coverage of wireless communication systems. Without any radio frequency (RF) chains or computing resources, however, the RIS requires control information to be sent to it from an external unit, e.g., a base station (BS). The control information can be delivered over wired or wireless channels, and the BS must be aware of the RIS and the RIS-related channel conditions in order to configure its behavior effectively. Recent works have introduced hybrid RIS structures possessing a few active elements that can sense and digitally process received data. Here, we propose an entirely autonomous RIS that operates without a control link between the RIS and the BS. Using a few sensing elements, the autonomous RIS employs a deep Q-network (DQN) based on reinforcement learning in order to enhance the sum rate of the network. Our results illustrate the potential of deploying autonomous RISs in wireless networks with essentially no network overhead.
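To make the autonomous-RIS idea concrete, the sketch below shows a small DQN head that maps signals sensed at a few active elements to a choice from a discrete codebook of phase-shift configurations, with ε-greedy exploration. All dimensions, the codebook, and the reward signal (e.g. an estimate of the sum rate) are assumptions for illustration, not the paper's design.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a handful of sensing elements, a discrete set of
# phase-shift configurations, and a sum-rate-based reward are assumed here.
class RisDQN(nn.Module):
    def __init__(self, n_sensors=4, n_configs=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_sensors, hidden),  # real and imaginary parts of sensed signals
            nn.ReLU(),
            nn.Linear(hidden, n_configs),      # one Q-value per phase-shift configuration
        )

    def forward(self, obs):
        return self.net(obs)

def select_config(q_net, obs, n_configs=16, epsilon=0.1):
    """ε-greedy choice of a phase-shift configuration from local observations only."""
    if torch.rand(()) < epsilon:
        return torch.randint(n_configs, ()).item()
    return q_net(obs).argmax().item()
```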