Macro-Action-Based Deep Multi-Agent Reinforcement Learning

Xiao, Yuchen; Hoffman, Joshua; Amato, Christopher

In real-world multi-robot systems, performing high-quality, collaborative behaviors requires robots to asynchronously reason about high-level action selection at varying time durations. Macro-Action Decentralized Partially Observable Markov Decision Processes (MacDec-POMDPs) provide a general framework for asynchronous decision making under uncertainty in fully cooperative multi-agent tasks. However, multi-agent deep reinforcement learning methods have only been developed for (synchronous) primitive-action problems. This paper proposes two Deep Q-Network (DQN) based methods for learning decentralized and centralized macro-action-value functions with novel macro-action trajectory replay buffers introduced for each case. Evaluations on benchmark problems and a larger domain demonstrate the advantage of learning with macro-actions over primitive-actions and the scalability of our approaches.
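The abstract does not describe how the macro-action trajectory replay buffers work internally. As a rough illustration only, and not the authors' implementation, the Python sketch below shows one plausible way a per-agent buffer could turn step-by-step experience into macro-action-level transitions: rewards are accumulated while a macro-action runs, and a transition is emitted only when that macro-action terminates. Every name here (`Step`, `MacroTransition`, `squeeze`, the `macro_done` flag, the `gamma` discount) is a hypothetical choice made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One primitive time step of a single agent (hypothetical layout)."""
    obs: int           # observation id when the step began
    macro_action: int  # macro-action being executed during this step
    reward: float      # one-step reward
    next_obs: int      # observation id after the step
    macro_done: bool   # True if the macro-action terminated at this step


@dataclass
class MacroTransition:
    """One macro-action-level transition, usable for a Q-learning update."""
    obs: int           # observation when the macro-action started
    macro_action: int
    cum_reward: float  # discounted reward accumulated over the macro-action
    next_obs: int      # observation when the macro-action terminated
    duration: int      # number of primitive steps the macro-action lasted


def squeeze(steps: List[Step], gamma: float = 1.0) -> List[MacroTransition]:
    """Collapse a step-level trajectory into macro-action transitions.

    Assumption: a macro-action still running at the end of the trajectory
    is dropped, since its outcome is not yet known.
    """
    transitions: List[MacroTransition] = []
    start_obs = 0
    acc = 0.0
    k = 0  # steps elapsed in the current macro-action
    for s in steps:
        if k == 0:
            start_obs = s.obs
        acc += (gamma ** k) * s.reward
        k += 1
        if s.macro_done:
            transitions.append(
                MacroTransition(start_obs, s.macro_action, acc, s.next_obs, k)
            )
            acc, k = 0.0, 0
    return transitions


if __name__ == "__main__":
    trajectory = [
        Step(0, 1, 1.0, 0, False),
        Step(0, 1, 2.0, 3, True),
        Step(3, 2, 0.5, 4, True),
        Step(4, 0, 1.0, 4, False),  # unfinished macro-action, dropped
    ]
    for t in squeeze(trajectory):
        print(t)
```

Under these assumptions, each agent's macro-actions may end at different time steps, so applying `squeeze` per agent yields transitions of different lengths, which is one way to picture the asynchronous setting the abstract refers to.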

Award ID(s): 1734497

PAR ID: 10167549

Author(s) / Creator(s): Xiao, Yuchen; Hoffman, Joshua; Amato, Christopher

Date Published: 2019-10-01

Journal Name: Conference on Robot Learning

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.