 Award ID(s):
 1901403
 Publication Date:
 NSFPAR ID:
 10294921
 Journal Name:
 NeurIPS Workshop on Cooperative AI
 Sponsoring Org:
 National Science Foundation
More Like this

We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfectinformation zerosum extensiveform game. Team members are not allowed to communicate during play but can coordinate before the game. In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. In this paper, we first provide new modeling results about computing such an optimal distribution by drawing a connection to a different literature on extensiveform correlation. Second, we provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile. We can also cap the number of such profiles we allow in the solution. This begets an anytime algorithm by increasing the cap. We find that often a handful of wellchosen such profiles suffices to reach optimal utility for the team. This enables team members to reach coordination through a relatively simple and understandable plan. Finally, inspired by this observation and leveraging theoretical concepts that we introduce, wemore »

We focus on the problem of finding an optimal strategy for a team of players that faces an opponent in an imperfectinformation zerosum extensiveform game. Team members are not allowed to communicate during play but can coordinate before the game. In this setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. In this paper, we first provide new modeling results about computing such an optimal distribution by drawing a connection to a different literature on extensiveform correlation. Second, we provide an algorithm that allows one for capping the number of profiles employed in the solution. This begets an anytime algorithm by increasing the cap. We find that often a handful of wellchosen such profiles suffices to reach optimal utility for the team. This enables team members to reach coordination through a simple and understandable plan. Finally, inspired by this observation and leveraging theoretical concepts that we introduce, we develop an efficient columngeneration algorithm for finding an optimal distribution for the team. We evaluate it on a suite of common benchmark games. It is threemore »

Oftenâ€”for example in war games, strategy video games, and financial simulationsâ€”the game is given to us only as a blackbox simulator in which we can play it. In these settings, since the game may have unknown nature action distributions (from which we can only obtain samples) and/or be too large to expand fully, it can be difficult to compute strategies with guarantees on exploitability. Recent work (Zhang and Sandholm 2020) resulted in a notion of certificate for extensiveform games that allows exploitability guarantees while not expanding the full game tree. However, that work assumed that the black box could sample or expand arbitrary nodes of the game tree at any time, and that a series of exact game solves (via, for example, linear programming) can be conducted to compute the certificate. Each of those two assumptions severely restricts the practical applicability of that method. In this work, we relax both of the assumptions. We show that highprobability certificates can be obtained with a black box that can do nothing more than play through games, using only a regret minimizer as a subroutine. As a bonus, we obtain an equilibriumfinding algorithm with ~O (1= p T) convergence rate in the extensiveformmore »

Although security games have attracted intensive research attention over the past years, few existing works consider how information from local communities would affect the game. In this paper, we introduce a new player  a strategic informant, who can observe and report upcoming attacks  to the defenderattacker security game setting. Characterized by a private type, the informant has his utility structure that leads to his strategic behaviors. We model the game as a 3player extensiveform game and propose a novel solution concept of Strong Stackelbergperfect Bayesian equilibrium. To compute the optimal defender strategy, we first show that although the informant can have infinitely many types in general, the optimal defense plan can only include a finite (exponential) number of different patrol strategies. We then prove that there exists a defense plan with only a linear number of patrol strategies that achieve the optimal defender's utility, which significantly reduces the computational burden and allows us to solve the game in polynomial time using linear programming. Finally, we conduct extensive experiments to show the effect of the strategic informant and demonstrate the effectiveness of our algorithm.

Hansen, Kristoffer Arnsfelt ; Liu, Tracy Xiao ; Malekian, Azarakhsh (Ed.)Empirical gametheoretic analysis (EGTA) is a general framework for reasoning about complex games using agentbased simulation. Data from simulating select strategy profiles is employed to estimate a cogent and tractable game model approximating the underlying game. To date, EGTA methodology has focused on game models in normal form; though the simulations play out in sequential observations and decisions over time, the game model abstracts away this temporal structure. Richer models of extensiveform games (EFGs) provide a means to capture temporal patterns in action and information, using tree representations. We propose treeexploiting EGTA (TEEGTA), an approach to incorporate EFG models into EGTA. TEEGTA constructs game models that express observations and temporal organization of activity, albeit at a coarser grain than the underlying agentbased simulation model. The idea is to exploit key structure while maintaining tractability. We establish theoretically and experimentally that exploiting even a little temporal structure can vastly reduce estimation error in strategyprofile payoffs compared to the normalform model. Further, we explore the implications of EFG models for iterative approaches to EGTA, where strategy spaces are extended incrementally. Our experiments on several game instances demonstrate that TEEGTA can also improve performance in the iterative setting, as measured by the qualitymore »