Stochastic regret minimization in extensive-form games

Farina, G.; Kroer, C.; Sandholm, T.

Citation Details

Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical guarantees on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on five games, where some variants of our methods outperform MCCFR.
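The idea of pairing a regret minimizer with a sampled gradient estimator can be illustrated on a toy problem. The sketch below is not the paper's algorithm and does not handle extensive-form games: it assumes a one-shot zero-sum matrix game (rock-paper-scissors), uses regret matching as the regret minimizer, and estimates each player's utility vector by sampling a single opponent action, which is an unbiased estimate of the expected utilities. All names (`RegretMatching`, `sampled_utility`, `solve`) are hypothetical and chosen only for illustration.

```python
import random

# Row player's payoffs for rock-paper-scissors (zero-sum); an assumed toy game.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


class RegretMatching:
    """Regret matching over n actions (one possible regret minimizer)."""

    def __init__(self, n):
        self.n = n
        self.regrets = [0.0] * n
        self.strategy_sum = [0.0] * n

    def next_strategy(self):
        # Play proportionally to positive cumulative regret, else uniformly.
        positive = [max(r, 0.0) for r in self.regrets]
        total = sum(positive)
        if total > 0:
            strategy = [p / total for p in positive]
        else:
            strategy = [1.0 / self.n] * self.n
        for i, s in enumerate(strategy):
            self.strategy_sum[i] += s
        return strategy

    def observe_utility(self, strategy, utility):
        # Update regrets with a (possibly estimated) utility vector.
        expected = sum(s * u for s, u in zip(strategy, utility))
        for i in range(self.n):
            self.regrets[i] += utility[i] - expected

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum]


def sampled_utility(payoff_rows, opponent_strategy, rng):
    """Unbiased utility estimate: sample one opponent action, read its column."""
    actions = range(len(opponent_strategy))
    j = rng.choices(actions, weights=opponent_strategy)[0]
    return [row[j] for row in payoff_rows]


def solve(iterations=20000, seed=0):
    rng = random.Random(seed)
    row, col = RegretMatching(3), RegretMatching(3)
    # Column player's payoffs, indexed [own action][row player's action].
    col_payoff = [[-PAYOFF[i][j] for i in range(3)] for j in range(3)]
    for _ in range(iterations):
        x = row.next_strategy()
        y = col.next_strategy()
        row.observe_utility(x, sampled_utility(PAYOFF, y, rng))
        col.observe_utility(y, sampled_utility(col_payoff, x, rng))
    return row.average_strategy(), col.average_strategy()
```

Calling `solve()` returns the two players' average strategies, which in this toy game should approach the uniform equilibrium as the iteration count grows. Because the estimator is passed in only through the utility vector, swapping `RegretMatching` for another regret minimizer, or `sampled_utility` for another unbiased estimator, leaves the loop unchanged; that separation is the point the sketch is meant to convey.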

Award ID(s): 1901403

PAR ID: 10289304

Author(s) / Creator(s): Farina, G.; Kroer, C.; Sandholm, T.

Date Published: 2020-01-01

Journal Name: ICML

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper: The DOI is not currently available.
