Title: Zero-Sum Games between Mean-Field Teams: Reachability-Based Analysis under Mean-Field Sharing
This work studies the behavior of two large-population teams competing in a discrete environment. The team-level interaction is modeled as a zero-sum game, while the agent dynamics within each team are formulated as a collaborative mean-field team problem. Drawing inspiration from the mean-field literature, we first approximate the large-population team game by its infinite-population limit. Subsequently, we construct a fictitious centralized system and transform the infinite-population game into an equivalent zero-sum game between two coordinators. Via a novel reachability analysis, we study the optimality of coordination strategies, which induce decentralized strategies under the original information structure. The optimality of the resulting strategies is established for the original finite-population game, and the theoretical guarantees are verified by numerical examples.
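At the coordinator level, the competition reduces to zero-sum games over coordination strategies. As a minimal illustrative sketch only (not the reachability-based method of the paper), the value and mixed equilibrium strategies of a zero-sum matrix game can be approximated by fictitious play:

```python
import numpy as np

def fictitious_play(A, iters=10000):
    """Approximate the value and mixed strategies of the zero-sum matrix
    game max_x min_y x^T A y by fictitious play: each player repeatedly
    best-responds to the opponent's empirical strategy. Illustrative only."""
    m, n = A.shape
    row_counts, col_counts = np.ones(m), np.ones(n)
    for _ in range(iters):
        # Row player best-responds to the column player's empirical mix.
        row_counts[np.argmax(A @ (col_counts / col_counts.sum()))] += 1
        # Column player best-responds to the row player's empirical mix.
        col_counts[np.argmin((row_counts / row_counts.sum()) @ A)] += 1
    x = row_counts / row_counts.sum()
    y = col_counts / col_counts.sum()
    return x, y, float(x @ A @ y)

# Matching pennies: the value is 0 and both players mix uniformly.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y, v = fictitious_play(A)
```

The empirical strategies converge to the equilibrium mix, which is the standard way to approximate such stage games without an LP solver.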
Award ID(s):
1849130
PAR ID:
10515363
Author(s) / Creator(s):
; ;
Publisher / Repository:
AAAI
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
38
Issue:
9
ISSN:
2159-5399
Page Range / eLocation ID:
9731 to 9739
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic cost are studied under an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers, whose utilities sum to zero, compete to influence a large population of indistinguishable agents. In particular, the case in which the transition and utility functions depend on the state, the actions of the controllers, and the means of the state and the actions is investigated. The optimality conditions of the game are analysed for both open-loop and closed-loop controls, and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy-optimization methods that rely on policy gradient are proposed, one model-based and one sample-based. In the model-based case, the gradients are computed exactly using the model, whereas in the sample-based case they are estimated using Monte-Carlo simulations. Numerical experiments demonstrate the convergence of the utility function as well as of the two players' controls.
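The gradient-based competition between the two controllers can be caricatured on a scalar saddle-point problem. A toy sketch with a hypothetical quadratic objective (not the linear-quadratic mean-field game of the paper, where the updates act on feedback gains): simultaneous gradient descent-ascent for min_u max_w f(u, w):

```python
def gda(grad_u, grad_w, u0, w0, lr=0.1, iters=500):
    """Simultaneous gradient descent-ascent for min_u max_w f(u, w).
    A toy stand-in for the model-based policy-gradient updates: the
    minimizing player descends its gradient while the maximizing
    player ascends its own."""
    u, w = u0, w0
    for _ in range(iters):
        gu, gw = grad_u(u, w), grad_w(u, w)
        u -= lr * gu  # minimizing player descends
        w += lr * gw  # maximizing player ascends
    return u, w

# f(u, w) = u**2 - w**2 + u*w has its unique saddle point at (0, 0).
u, w = gda(lambda u, w: 2 * u + w,   # df/du
           lambda u, w: -2 * w + u,  # df/dw
           u0=3.0, w0=-2.0)
```

For this strongly convex-concave objective the iterates contract to the saddle point for a small enough step size; in the sample-based setting the exact gradients would be replaced by Monte-Carlo estimates.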
  2. This paper is concerned with two-person mean-field linear-quadratic non-zero-sum stochastic differential games in an infinite horizon. Both open-loop and closed-loop Nash equilibria are introduced. The existence of an open-loop Nash equilibrium is characterized by the solvability of a system of mean-field forward-backward stochastic differential equations in an infinite horizon together with the convexity of the cost functionals, and the closed-loop representation of an open-loop Nash equilibrium is given through the solution to a system of two coupled non-symmetric algebraic Riccati equations. The existence of a closed-loop Nash equilibrium is characterized by the solvability of a system of two coupled symmetric algebraic Riccati equations. Two-person mean-field linear-quadratic zero-sum stochastic differential games in an infinite horizon are also considered. The existence of both open-loop and closed-loop saddle points is characterized by the solvability of a system of two coupled generalized algebraic Riccati equations with static stabilizing solutions. Mean-field linear-quadratic stochastic optimal control problems in an infinite horizon are discussed as well, for which it is proved that open-loop solvability and closed-loop solvability are equivalent.
  3. The theory of mean field games is a tool for understanding noncooperative dynamic stochastic games with a large number of players. Much of the theory has evolved under conditions ensuring uniqueness of the mean field game Nash equilibrium. However, in some situations, typically involving symmetry breaking, non-uniqueness of solutions is an essential feature. To investigate the nature of non-unique solutions, this paper focuses on the technically simple setting in which players have one of two states, the dynamics are in continuous time, the game is symmetric in the players, and players are restricted to using Markov strategies. All the mean field game Nash equilibria are identified for a symmetric follow-the-crowd game. Such equilibria correspond to symmetric $$\epsilon$$-Nash Markov equilibria for $$N$$ players, with $$\epsilon$$ converging to zero as $$N$$ goes to infinity. In contrast to the mean field game, there is a unique Nash equilibrium for finite $$N$$. It is shown that fluid limits arising from the Nash equilibria for finite $$N$$, as $$N$$ goes to infinity, are mean field game Nash equilibria, and evidence is given supporting the conjecture that such limits, among all mean field game Nash equilibria, are the ones that are stable fixed points of the mean field best response mapping.
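The stability notion in the conjecture above can be illustrated on a two-state follow-the-crowd toy model. The sketch below uses a hypothetical smoothed (logit) best-response map, not the continuous-time dynamics of the paper: m is the fraction of the crowd in state 1, and we iterate m ↦ BR(m):

```python
import math

def best_response(m, beta=8.0):
    """Smoothed (logit) best response of a follow-the-crowd agent: the
    probability of choosing state 1 when a fraction m of the crowd is
    in state 1. A hypothetical toy map for illustration."""
    return 1.0 / (1.0 + math.exp(-beta * (m - 0.5)))

def iterate_br(m0, steps=100):
    """Fixed-point iteration of the mean-field best-response map."""
    m = m0
    for _ in range(steps):
        m = best_response(m)
    return m

# m = 1/2 is a fixed point but unstable (the map's slope there is
# beta/4 > 1): small perturbations flow to the two stable fixed
# points near 0 and 1 — the symmetry-breaking equilibria.
low = iterate_br(0.49)
high = iterate_br(0.51)
```

The three fixed points of this map play the role of the multiple mean field game equilibria; only the two outer ones are stable under best-response iteration.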
  4. The ideal free distribution in ecology was introduced by Fretwell and Lucas to model the habitat selection of animal populations. In this paper, we revisit the concept via a mean field game system with local coupling, which models a dynamic version of the habitat selection game in ecology. We establish the existence of a classical solution of the ergodic mean field game system, including the case of heterogeneous diffusion when the underlying domain is one-dimensional, and further show that the population density of agents converges to the ideal free distribution of the underlying habitat selection game as the cost of control tends to zero. Our analysis provides a derivation of the ideal free distribution in a dynamical context.
  5. Recent algorithms have achieved superhuman performance at a number of two-player zero-sum games such as poker and Go. However, many real-world situations are multi-player games. Zero-sum two-team games, such as bridge and football, involve two teams where each member of a team shares the same reward with every other member of that team, and each team has the negative of the other team's reward. A popular solution concept in this setting, called TMECor, assumes that teams can jointly correlate their strategies before play but cannot communicate during play. This setting is harder than two-player zero-sum games because each player on a team has different information and must use their public actions to signal to other members of the team. Prior works either have game-theoretic guarantees but only work in very small games, or scale to large games but lack game-theoretic guarantees. In this paper we introduce two algorithms: Team-PSRO, an extension of PSRO from two-player games to team games, and Team-PSRO Mix-and-Match, which improves upon Team-PSRO by better using population policies. In Team-PSRO, in every iteration both teams learn a joint best response to the opponent's meta-strategy via reinforcement learning. As the reinforcement-learning joint best response approaches the optimal best response, Team-PSRO is guaranteed to converge to a TMECor. In experiments on Kuhn poker and Liar's Dice, we show that a tabular version of Team-PSRO converges to TMECor, and a version of Team-PSRO using deep cooperative reinforcement learning beats self-play reinforcement learning in the large game of Google Research Football.
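Team-PSRO builds on the PSRO / double-oracle loop: solve the restricted meta-game over current population policies, then add a best response to each side's population. A minimal double-oracle sketch on a zero-sum matrix game, with an exact best response standing in for the RL-trained joint best response (a skeleton of the loop, not the paper's implementation):

```python
import numpy as np

def solve_meta(A, iters=4000):
    """Approximate a Nash equilibrium of a zero-sum matrix game by
    fictitious play -- the restricted 'meta-game' solver."""
    m, n = A.shape
    rc, cc = np.ones(m), np.ones(n)
    for _ in range(iters):
        rc[np.argmax(A @ (cc / cc.sum()))] += 1
        cc[np.argmin((rc / rc.sum()) @ A)] += 1
    return rc / rc.sum(), cc / cc.sum()

def double_oracle(A, outer_iters=6):
    """PSRO-style double oracle: repeatedly solve the restricted
    meta-game, then grow each population with a best response to the
    opponent's meta-strategy. Team-PSRO replaces the exact best
    response below with a joint best response learned by RL."""
    rows, cols = [0], [0]
    for _ in range(outer_iters):
        x, y = solve_meta(A[np.ix_(rows, cols)])
        # Lift the restricted meta-strategies to the full action space.
        px = np.zeros(A.shape[0]); px[rows] = x
        py = np.zeros(A.shape[1]); py[cols] = y
        br_row = int(np.argmax(A @ py))  # row player's best response
        br_col = int(np.argmin(px @ A))  # column player's best response
        if br_row not in rows: rows.append(br_row)
        if br_col not in cols: cols.append(br_col)
    return px, py

# Rock-paper-scissors: the populations grow to full support and the
# meta-strategies approach the uniform equilibrium.
RPS = np.array([[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])
x, y = double_oracle(RPS)
```

The guarantee sketched in the abstract has the same shape: as the learned joint best response approaches the exact one, the oracle loop converges to the equilibrium of the full game.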