Title: Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning
Award ID(s):
2312342
PAR ID:
10517617
Publisher / Repository:
NeurIPS
Date Published:
Format(s):
Medium: X
Location:
New Orleans, LA
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent algorithms have achieved superhuman performance at a number of two-player zero-sum games such as poker and Go. However, many real-world situations are multi-player games. Zero-sum two-team games, such as bridge and football, involve two teams where each member of a team shares the same reward with every other member of that team, and each team receives the negative of the other team's reward. A popular solution concept in this setting, called TMECor, assumes that teams can jointly correlate their strategies before play but cannot communicate during play. This setting is harder than two-player zero-sum games because each player on a team has different information and must use their public actions to signal to other members of the team. Prior works either have game-theoretic guarantees but only work in very small games, or scale to large games but lack game-theoretic guarantees. In this paper we introduce two algorithms: Team-PSRO, an extension of PSRO from two-player games to team games, and Team-PSRO Mix-and-Match, which improves upon Team-PSRO by better using population policies. In Team-PSRO, in every iteration both teams learn a joint best response to the opponent's meta-strategy via reinforcement learning. As the reinforcement learning joint best response approaches the optimal best response, Team-PSRO is guaranteed to converge to a TMECor. In experiments on Kuhn poker and Liar's Dice, we show that a tabular version of Team-PSRO converges to TMECor, and a version of Team-PSRO using deep cooperative reinforcement learning beats self-play reinforcement learning in the large game of Google Research Football. (An illustrative sketch of this population loop follows the list below.)
  2. The purpose of the Digitally-Mediated Team Learning (DMTL) Workshop (sponsored by the National Science Foundation through a Dear Colleague Letter [NSF 18-017] via grant 1825007) was to ascertain the current state of the field and future research approaches for DMTL delivered through synchronous modalities in STEM classrooms for students in upper elementary grades through college. The overarching question for the workshop was: "How can we advance effective and scalable digital environments for synchronous team-based learning involving problem-solving and design activities within STEM classrooms for all learners?" The workshop explored the state of the field and future directions of DMTL through its four tracks: (a) student-facing and instructor-facing tools, (b) learning analytics, (c) pedagogical and andragogical strategies, and (d) inclusivity.
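The Team-PSRO abstract above describes a population-based loop: solve the restricted zero-sum meta-game over the joint team policies discovered so far, then have each team learn a best response to the opponent's meta-strategy and add it to the population. The following is a minimal sketch of that loop in Python on a toy normal-form game, not the authors' implementation: exact argmax/argmin best responses and a fictitious-play meta-solver stand in for the cooperative-RL best-response step the paper uses, and the names `team_psro` and `solve_meta_game` are hypothetical.

```python
import numpy as np

def solve_meta_game(payoffs, iters=5000):
    """Approximately solve the restricted zero-sum meta-game with
    fictitious play; payoffs[i, j] is team 1's value when team 1
    plays joint policy i and team 2 plays joint policy j."""
    row_counts = np.ones(payoffs.shape[0])
    col_counts = np.ones(payoffs.shape[1])
    for _ in range(iters):
        col_meta = col_counts / col_counts.sum()
        row_meta = row_counts / row_counts.sum()
        row_counts[np.argmax(payoffs @ col_meta)] += 1.0   # team 1 best reply
        col_counts[np.argmin(row_meta @ payoffs)] += 1.0   # team 2 best reply
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

def team_psro(payoffs, iterations=20):
    """Toy Team-PSRO-style double-oracle loop. Each index stands in
    for one correlated joint team policy; the exact best responses
    below replace the paper's cooperative-RL best-response step."""
    row_pop, col_pop = [0], [0]                        # policy populations
    for _ in range(iterations):
        restricted = payoffs[np.ix_(row_pop, col_pop)]
        row_meta, col_meta = solve_meta_game(restricted)
        # Each team best-responds to the opponent's meta-strategy.
        br_row = int(np.argmax(payoffs[:, col_pop] @ col_meta))
        br_col = int(np.argmin(row_meta @ payoffs[row_pop, :]))
        if br_row in row_pop and br_col in col_pop:    # no new policies: done
            break
        if br_row not in row_pop:
            row_pop.append(br_row)
        if br_col not in col_pop:
            col_pop.append(br_col)
    return row_pop, col_pop, row_meta, col_meta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    game = rng.standard_normal((8, 8))                 # random zero-sum game
    row_pop, col_pop, row_meta, col_meta = team_psro(game)
    print("team 1 population:", row_pop, "meta:", np.round(row_meta, 3))
    print("team 2 population:", col_pop, "meta:", np.round(col_meta, 3))
```

In the full algorithm, each population entry is a correlated joint policy for an entire team rather than a single pure strategy, and the best-response step is only approximate (learned by cooperative RL), which is why the abstract guarantees convergence to TMECor only as that approximation approaches the optimal best response.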