skip to main content


Title: Scalable Equilibrium Computation in Multi-agent Influence Games on Networks
We provide a polynomial-time, scalable algorithm for equilibrium computation in multi-agent influence games on networks, extending work of Bindel, Kleinberg, and Oren (2015) from the single-agent to the multi-agent setting. In games of influence, agents have limited advertising budget to influence the initial predisposition of nodes in some network towards their products, but the eventual decisions of the nodes are determined by the stationary state of DeGroot opinion dynamics on the network, which takes over after the seeding (Ahmadinejad et al. 2014, 2015). In multi-agent systems, how should agents spend their budgets to seed the network to maximize their utility in anticipation of other advertising agents and the network dynamics? We show that Nash equilibria of this game are pure and (under weak assumptions) unique, and can be computed in polynomial time; we test our model by computing equilibria using mirror descent for the two-agent case on random graphs.  more » « less
Award ID(s):
1846237 1852352
NSF-PAR ID:
10315231
Author(s) / Creator(s):
; ; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The existence of simple uncoupled no-regret learning dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form games generalize normal-form games by modeling both sequential and simultaneous moves, as well as imperfect information. Because of the sequential nature and presence of private information in the game, correlation in extensive-form games possesses significantly different properties than its counterpart in normal-form games, many of which are still open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to the classical notion of correlated equilibrium in normal-form games. Compared to the latter, the constraints that define the set of EFCEs are significantly more complex, as the correlation device must keep into account the evolution of beliefs of each player as they make observations throughout the game. Due to that significant added complexity, the existence of uncoupled learning dynamics leading to an EFCE has remained a challenging open research question for a long time. In this article, we settle that question by giving the first uncoupled no-regret dynamics that converge to the set of EFCEs in n-player general-sum extensive-form games with perfect recall. We show that each iterate can be computed in time polynomial in the size of the game tree, and that, when all players play repeatedly according to our learning dynamics, the empirical frequency of play is proven to be a O(T^-0.5)-approximate EFCE with high probability after T game repetitions, and an EFCE almost surely in the limit. 
    more » « less
  2. Andreas Krause, Emma Brunskill (Ed.)
    Executing actions in a correlated manner is a common strategy for human coordination that often leads to better cooperation, which is also potentially beneficial for cooperative multi-agent reinforcement learning (MARL). However, the recent success of MARL relies heavily on the convenient paradigm of purely decentralized execution, where there is no action correlation among agents for scalability considerations. In this work, we introduce a Bayesian network to inaugurate correlations between agents’ action selections in their joint policy. Theoretically, we establish a theoretical justification for why action dependencies are beneficial by deriving the multi-agent policy gradient formula under such a Bayesian network joint policy and proving its global convergence to Nash equilibria under tabular softmax policy parameterization in cooperative Markov games. Further, by equipping existing MARL algorithms with a recent method of differentiable directed acyclic graphs (DAGs), we develop practical algorithms to learn the context-aware Bayesian network policies in scenarios with partial observability and various difficulty. We also dynamically decrease the sparsity of the learned DAG throughout the training process, which leads to weakly or even purely independent policies for decentralized execution. Empirical results on a range of MARL benchmarks show the benefits of our approach. 
    more » « less
  3. Evolutionary anti-coordination games on networks capture real-world strategic situations such as traffic routing and market competition. Two key problems concerning evolutionary games are the existence of a pure Nash equilibrium (NE) and the convergence time. In this work, we study these two problems for anti-coordination games under sequential and synchronous update schemes. For each update scheme, we examine two decision modes based on whether an agent considers its own previous action (self essential) or not (self non-essential) in choosing its next action. Using a relationship between games and dynamical systems, we show that for both update schemes, finding an NE can be done efficiently under the self non-essential mode but is computationally intractable under the self essential mode. We then identify special cases for which an NE can be obtained efficiently. For convergence time, we show that the dynamics converges in a polynomial number of steps under the synchronous scheme; for the sequential scheme, the convergence time is polynomial only under the self non-essential mode. Through experiments, we empirically examine the convergence time and the equilibria for both synthetic and real-world networks.

     
    more » « less
  4. Networked public goods games model scenarios in which self-interested agents decide whether or how much to invest in an action that benefits not only themselves, but also their network neighbors. Examples include vaccination, security investment, and crime reporting. While every agent's utility is increasing in their neighbors' joint investment, the specific form can vary widely depending on the scenario. A principal, such as a policymaker, may wish to induce large investment from the agents. Besides direct incentives, an important lever here is the network structure itself: by adding and removing edges, for example, through community meetings, the principal can change the nature of the utility functions, resulting in different, and perhaps socially preferable, equilibrium outcomes. We initiate an algorithmic study of targeted network modifications with the goal of inducing equilibria of a particular form. We study this question for a variety of equilibrium forms (induce all agents to invest, at least a given set S, exactly a given set S, at least k agents), and for a variety of utility functions. While we show that the problem is NP-complete for a number of these scenarios, we exhibit a broad array of scenarios in which the problem can be solved in polynomial time by non-trivial reductions to (minimum-cost) matching problems. 
    more » « less
  5. The mammalian suprachiasmatic nucleus (SCN) comprises about 20,000 interconnected oscillatory neurons that create and maintain a robust circadian signal which matches to external light cues. Here, we use an evolutionary game theoretic framework to explore how evolutionary constraints can influence the synchronization of the system under various assumptions on the connection topology, contributing to the understanding of the structure of interneuron connectivity. Our basic model represents the SCN as a network of agents each with two properties—a phase and a flag that determines if it communicates with its neighbors or not. Communication comes at a cost to the agent, but synchronization of phases with its neighbors bears a benefit. Earlier work shows that when we have “all-to-all” connectivity, where every agent potentially communicates with every other agent, there is often a simple trade-off that leads to complete communication and synchronization of the system: the benefit must be greater than twice the cost. This trade-off for all-to-all connectivity gives us a baseline to compare to when looking at other topologies. Using simulations, we compare three plausible topologies to the all-to-all case, finding that convergence to synchronous dynamics occurs in all considered topologies under similar benefit and cost trade-offs. Consequently, sparser, less biologically costly topologies are reasonable evolutionary outcomes for organisms that develop a synchronizable oscillatory network. Our simulations also shed light on constraints imposed by the time scale on which we observe the SCN to arise in mammals. We find two conditions that allow for a synchronizable system to arise in relatively few generations. First, the benefits of connectivity must outweigh the cost of facilitating the connectivity in the network. Second, the game at the core of the model needs to be more cooperative than antagonistic games such as the Prisoner’s Dilemma. These results again imply that evolutionary pressure may have driven the system towards sparser topologies, as they are less costly to create and maintain. Last, our simulations indicate that models based on the mutualism game fare the best in uptake of communication and synchronization compared to more antagonistic games such as the Prisoner’s Dilemma. 
    more » « less