Model-Based Reinforcement Learning for Offline Zero-Sum Markov Games

Yan, Yuling; Li, Gen; Chen, Yuxin; Fan, Jianqing

doi:10.1287/opre.2022.0342

Citation Details

Model-Based Reinforcement Learning for Offline Zero-Sum Markov Games

This paper makes progress toward learning Nash equilibria in two-player, zero-sum Markov games from offline data. Despite a large number of prior works tackling this problem, the state-of-the-art results suffer from the curse of multiple agents in the sense that their sample complexity bounds scale linearly with the total number of joint actions. The current paper proposes a new model-based algorithm, which provably finds an approximate Nash equilibrium with a sample complexity that scales linearly with the total number of individual actions. This work also develops a matching minimax lower bound, demonstrating the minimax optimality of the proposed algorithm for a broad regime of interest. An appealing feature of the result lies in algorithmic simplicity, which reveals the unnecessity of sophisticated variance reduction and sample splitting in achieving sample optimality. more »

Award ID(s):: 2210833 2053832 2052926

PAR ID:: 10593070

Author(s) / Creator(s):: Yan, Yuling; Li, Gen; Chen, Yuxin; Fan, Jianqing

Publisher / Repository:: INFORMS

Date Published:: 2024-11-01

Journal Name:: Operations Research

Volume:: 72

Issue:: 6

ISSN:: 0030-364X

Page Range / eLocation ID:: 2430-2445

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Journal Article:
https://doi.org/10.1287/opre.2022.0342

More Like this