This content will become publicly available on May 7, 2026

Title: Measuring higher-order rationality with belief control
Determining an individual’s strategic reasoning capability based solely on choice data is a complex task. This complexity arises because sophisticated players might have non-equilibrium beliefs about others, leading to non-equilibrium actions. In our study, we pair human participants with computer players known to be fully rational. This use of robot players allows us to disentangle limited reasoning capacity from belief formation and social biases. Our results show that, when paired with robots, subjects consistently demonstrate higher levels of rationality, compared to when paired with human players. Furthermore, players’ rationality levels are relatively stable across games when paired with robot players, even though those with intermediate rationality levels exhibit inconsistency across games. Leveraging our experimental design, we identify and document potential causes of this inconsistency.
Award ID(s):
2243268
PAR ID:
10592559
Author(s) / Creator(s):
; ;
Publisher / Repository:
Cambridge University Press
Date Published:
Journal Name:
Experimental Economics
ISSN:
1386-4157
Page Range / eLocation ID:
1 to 28
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Driven by recent successes in two-player, zero-sum game solving and playing, artificial intelligence work on games has increasingly focused on algorithms that produce equilibrium-based strategies. However, this approach has been less effective at producing competent players in general-sum games or those with more than two players than in two-player, zero-sum games. An appealing alternative is to consider adaptive algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior. This approach also leads to a game-theoretic analysis, but in the correlated play that arises from joint learning dynamics rather than factored agent behavior at equilibrium. We develop and advocate for this hindsight rationality framing of learning in general sequential decision-making settings. To this end, we re-examine mediated equilibrium and deviation types in extensive-form games, thereby gaining a more complete understanding and resolving past misconceptions. We present a set of examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature, and prove that no tractable concept subsumes all others. This line of inquiry culminates in the definition of the deviation and equilibrium classes that correspond to algorithms in the counterfactual regret minimization (CFR) family, relating them to all others in the literature. Examining CFR in greater detail further leads to a new recursive definition of rationality in correlated play that extends sequential rationality in a way that naturally applies to hindsight evaluation.
  2. We extend Kreps and Wilson's concept of sequential equilibrium to games with infinite sets of signals and actions. A strategy profile is a conditional ε‐equilibrium if, for any of a player's positive probability signal events, his conditional expected utility is within ε of the best that he can achieve by deviating. With topologies on action sets, a conditional ε‐equilibrium is full if strategies give every open set of actions positive probability. Such full conditional ε‐equilibria need not be subgame perfect, so we consider a non‐topological approach. Perfect conditional ε‐equilibria are defined by testing conditional ε‐rationality along nets of small perturbations of the players' strategies and of nature's probability function that, for any action and for almost any state, make this action and state eventually (in the net) always have positive probability. Every perfect conditional ε‐equilibrium is a subgame perfect ε‐equilibrium, and, in finite games, limits of perfect conditional ε‐equilibria as ε → 0 are sequential equilibrium strategy profiles. But limit strategies need not exist in infinite games, so we consider instead the limit distributions over outcomes. We call such outcome distributions perfect conditional equilibrium distributions and establish their existence for a large class of regular projective games. Nature's perturbations can produce equilibria that seem unintuitive, and so we augment the game with a net of permissible perturbations.
  3. We investigate a linear–quadratic stochastic zero-sum game where two players lobby a political representative to invest in a wind farm. Players are time-inconsistent because they discount the utility with a non-constant rate. Our objective is to identify a consistent planning equilibrium in which the players are aware of their inconsistency and cannot commit to a lobbying policy. We analyse equilibrium behaviour in both single-player and two-player cases and compare the behaviours of the game under constant and variable discount rates. The equilibrium behaviour is provided in closed-loop form, either analytically or via numerical approximation. Our numerical analysis of the equilibrium reveals that strategic behaviour leads to more intense lobbying without resulting in overshooting. 
  4. A mediator observes no-regret learners playing an extensive-form game repeatedly across T rounds. The mediator attempts to steer players toward some desirable predetermined equilibrium by giving (nonnegative) payments to players. We call this the steering problem. The steering problem captures several problems of interest, among them equilibrium selection and information design (persuasion). If the mediator’s budget is unbounded, steering is trivial because the mediator can simply pay the players to play desirable actions. We study two bounds on the mediator’s payments: a total budget and a per-round budget. If the mediator’s total budget does not grow with T, we show that steering is impossible. However, we show that it is enough for the total budget to grow sublinearly with T, that is, for the average payment to vanish. When players’ full strategies are observed at each round, we show that constant per-round budgets permit steering. In the more challenging setting where only trajectories through the game tree are observable, we show that steering is impossible with constant per-round budgets in general extensive-form games, but possible in normal-form games or if the per-round budget may itself depend on T. We also show how our results can be generalized to the case when the equilibrium is being computed online while steering is happening. We supplement our theoretical positive results with experiments highlighting the efficacy of steering in large games.
  5. Data-driven modeling increasingly requires finding a Nash equilibrium in multi-player games, e.g., when training GANs. In this paper, we analyse a new extra-gradient method for finding Nash equilibria that performs gradient extrapolations and updates on a random subset of players at each iteration. This approach provably exhibits a better rate of convergence than full extra-gradient for non-smooth convex games with a noisy gradient oracle. We propose an additional variance reduction mechanism to obtain speed-ups in smooth convex games. Our approach makes extrapolation amenable to massive multiplayer settings, and brings empirical speed-ups, in particular when using a heuristic cyclic sampling scheme. Most importantly, it enables faster training of better GANs and mixtures of GANs.