Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

null (Ed.)Driven by recent successes in twoplayer, zerosum game solving and playing, artificial intelligence work on games has increasingly focused on algorithms that produce equilibriumbased strategies. However, this approach has been less effective at producing competent players in generalsum games or those with more than two players than in twoplayer, zerosum games. An appealing alternative is to consider adaptive algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior. This approach also leads to a gametheoretic analysis, but in the correlated play that arises from joint learning dynamics rather than factored agent behavior at equilibrium. We develop and advocate for this hindsight rationality framing of learning in general sequential decisionmaking settings. To this end, we reexamine mediated equilibrium and deviation types in extensiveform games, thereby gaining a more complete understanding and resolving past misconceptions. We present a set of examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature, and prove that no tractable concept subsumes all others. This line of inquiry culminates in the definition of the deviation and equilibrium classes that correspond to algorithms in the counterfactual regret minimization (CFR) family, relating them to all others in the literature. Examining CFR in greater detail further leads to a new recursive definition of rationality in correlated play that extends sequential rationality in a way that naturally applies to hindsight evaluation.more » « less

null (Ed.)We present a methodology to robustly estimate the competitive equilibria (CE) of combinatorial markets under the assumption that buyers do not know their precise valuations for bundles of goods, but instead can only provide noisy estimates. We first show tight lower and upperbounds on the buyers' utility loss, and hence the set of CE, given a uniform approximation of one market by another. We then present two probablyapproximatelycorrect algorithms for learning CE with finitesample guarantees. The first is a baseline and the second leverages a connection between the first welfare theorem of economics and uniform approximations to adaptively prune value queries when it is determined that they are provably not part of a CE. Extensive experimentation shows that pruning achieves better estimates than the baseline with far fewer samples.more » « less

null (Ed.)Hindsight rationality is an approach to playing generalsum games that prescribes noregret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria. To develop hindsight rational learning in sequential decisionmaking settings, we formalize behavioral deviations as a general class of deviations that respect the structure of extensiveform games. Integrating the idea of time selection into counterfactual regret minimization (CFR), we introduce the extensiveform regret minimization (EFR) algorithm that achieves hindsight rationality for any given set of behavioral deviations with computation that scales closely with the complexity of the set. We identify behavioral deviation subsets, the partial sequence deviation types, that subsume previously studied types and lead to efficient EFR instances in games with moderate lengths. In addition, we present a thorough empirical analysis of EFR instantiated with different deviation types in benchmark games, where we find that stronger types typically induce better performance.more » « less