A mediator observes no-regret learners playing an extensive-form game repeatedly across T rounds. The mediator attempts to steer players toward some desirable predetermined equilibrium by giving (nonnegative) payments to players. We call this the steering problem. The steering problem captures problems several problems of interest, among them equilibrium selection and information design (persuasion). If the mediator’s budget is unbounded, steering is trivial because the mediator can simply pay the players to play desirable actions. We study two bounds on the mediator’s payments: a total budget and a per-round budget. If the mediator’s total budget does not grow with T, we show that steering is impossible. However, we show that it is enough for the total budget to grow sublinearly with T, that is, for the average payment to vanish. When players’ full strategies are observed at each round, we show that constant per-round budgets permit steering. In the more challenging setting where only trajectories through the game tree are observable, we show that steering is impossible with constant per-round budgets in general extensive-form games, but possible in normal-form games or if the per-round budget may itself depend on T. We also show how our results can be generalized to the case when the equilibrium is being computed online while steering is happening. We supplement our theoretical positive results with experiments highlighting the efficacy of steering in large games.
more »
« less
Implementing Mediators with Asynchronous Cheap Talk
A mediator can help non-cooperative agents obtain an equilibrium that may otherwise not be possible. We study the ability of players to obtain the same equilibrium without a mediator, using only cheap talk, that is, nonbinding pre-play communication. Previous work has considered this problem in a synchronous setting. Here we consider the effect of asynchrony on the problem, and provide upper bounds for implementing mediators. Considering asynchronous environments introduces new subtleties, including exactly what solution concept is most appropriate and determining what move is played if the cheap talk goes on forever. Different results are obtained depending on whether the move after such ``infinite play'' is under the control of the players or part of the description of the game.
more »
« less
- Award ID(s):
- 1703846
- PAR ID:
- 10156231
- Date Published:
- Journal Name:
- Proceedings of the 38th Annual ACM Symposium on Principles of Distributed Computing
- Page Range / eLocation ID:
- 501 to 510
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The literature on strategic communication originated with the influential cheap talk model, which precedes the Bayesian persuasion model by three decades. This model describes an interaction between two agents: sender and receiver. The sender knows some state of the world which the receiver does not know, and tries to influence the receiver’s action by communicating a cheap talk message to the receiver. This paper initiates the algorithmic study of cheap talk in a finite environment (i.e., a finite number of states and receiver’s possible actions). We first prove that approximating the sender-optimal or the welfare-maximizing cheap talk equilibrium up to a certain additive constant or multiplicative factor is NP-hard. Fortunately, we identify three naturally-restricted cases that admit efficient algorithms for finding a sender-optimal equilibrium. These include a state-independent sender’s utility structure, a constant number of states or a receiver having only two actions.more » « less
-
Abraham, Dolev, Geffner, and Halpern [ 1 ] proved that, in asynchronous systems, a (k, t)-robust equilibrium for n players and a trusted mediator can be implemented without the mediator as long as n > 4( k+t ), where an equilibrium is ( k, t )-robust if, roughly speaking, no coalition of t players can decrease the payoff of any of the other players, and no coalition of k players can increase their payoff by deviating. We prove that this bound is tight, in the sense that if n ≤ 4( k+t ) there exist ( k, t )-robust equilibria with a mediator that cannot be implemented by the players alone. Even though implementing ( k, t )-robust mediators seems closely related to implementing asynchronous multiparty ( k+t )-secure computation [ 6 ], to the best of our knowledge there is no known straightforward reduction from one problem to another. Nevertheless, we show that there is a non-trivial reduction from a slightly weaker notion of ( k+t )-secure computation, which we call ( k+t )-strict secure computation , to implementing ( k, t )-robust mediators. We prove the desired lower bound by showing that there are functions on n variables that cannot be ( k+t )-strictly securely computed if n ≤ 4( k+t ). This also provides a simple alternative proof for the well-known lower bound of 4 t +1 on asynchronous secure computation in the presence of up to t malicious agents [ 4 , 8 , 10 ].more » « less
-
null (Ed.)Justified communication equilibrium (JCE) is an equilibrium refinement for signaling games with cheap-talk communication. A strategy profile must be a JCE to be a stable outcome of nonequilibrium learning when receivers are initially trusting and senders play many more times than receivers. In the learning model, the counterfactual “speeches” that have been informally used to motivate past refinements are messages that are actually sent. Stable profiles need not be perfect Bayesian equilibria, so JCE sometimes preserves equilibria that existing refinements eliminate. Despite this, it resembles the earlier refinements D1 and NWBR, and it coincides with them in co-monotonic signaling games. (JEL C70, D82, D83, J23, M51)more » « less
-
null (Ed.)We study the following problem, which to our knowledge has been addressed only partially in the literature and not in full generality. An agent observes two players play a zero-sum game that is known to the players but not the agent. The agent observes the actions and state transitions of their game play, but not rewards. The players may play either op-timally (according to some Nash equilibrium) or according to any other solution concept, such as a quantal response equilibrium. Following these observations, the agent must recommend a policy for one player, say Player 1. The goal is to recommend a policy that is minimally exploitable un-der the true, but unknown, game. We take a Bayesian ap-proach. We establish a likelihood function based on obser-vations and the specified solution concept. We then propose an approach based on Markov chain Monte Carlo (MCMC), which allows us to approximately sample games from the agent’s posterior belief distribution. Once we have a batch of independent samples from the posterior, we use linear pro-gramming and backward induction to compute a policy for Player 1 that minimizes the sum of exploitabilities over these games. This approximates the policy that minimizes the ex-pected exploitability under the full distribution. Our approach is also capable of handling counterfactuals, where known modifications are applied to the unknown game. We show that our Bayesian MCMC-based technique outperforms two other techniques—one based on the equilibrium policy of the maximum-probability game and the other based on imitation of observed behavior—on all the tested stochastic game envi-ronments.more » « less
An official website of the United States government

