Title: Multi-agent motion planning using differential games with lexicographic preferences
Multi-player games with lexicographic cost functions can capture a variety of driving and racing scenarios and are known to have pure-strategy Nash Equilibria (NE) under certain conditions. The standard Iterated Best Response (IBR) procedure for finding such equilibria can be slow because computing the best response for each agent generally involves solving a non-convex optimization problem. In this paper, we introduce a type of game which uses a lexicographic cost function. We show that for this class of games, the best responses can be effectively computed through piece-wise linear approximations. This enables us to approximate the NE using a linearized version of IBR. We show the gap between the linear approximations returned by our linearized IBR and the true best response drops asymptotically. We implement the algorithm and show that it can find approximate NE for a handful of agents driving in realistic scenarios in under 10 seconds.
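As a rough illustration of the Iterated Best Response idea named in the abstract, here is a minimal Python sketch on a toy lane-choice game. It is not the paper's linearized IBR and involves no differential game: the discrete lanes, the two-level cost (collision first, then deviation from a preferred lane), and all function names are assumptions made for this example only. Lexicographic preference is modeled with Python tuples, which compare element by element.

```python
from typing import List, Sequence, Tuple

# Hypothetical two-level cost: (collision indicator, lane deviation).
# Python compares tuples lexicographically, so the first entry dominates.
Cost = Tuple[int, int]


def lex_cost(player: int, joint: Sequence[int], preferred: Sequence[int]) -> Cost:
    """Cost of `player` under the joint action `joint` (toy model, assumed)."""
    own = joint[player]
    collision = int(any(a == own for j, a in enumerate(joint) if j != player))
    deviation = abs(own - preferred[player])
    return (collision, deviation)


def best_response(player: int, joint: Sequence[int],
                  actions: Sequence[int], preferred: Sequence[int]) -> int:
    """Action minimizing the player's lexicographic cost, others held fixed."""
    def cost_if(action: int) -> Cost:
        trial = list(joint)
        trial[player] = action
        return lex_cost(player, trial, preferred)
    return min(actions, key=cost_if)


def iterated_best_response(start: Sequence[int], actions: Sequence[int],
                           preferred: Sequence[int],
                           max_rounds: int = 50) -> Tuple[List[int], bool]:
    """Round-robin IBR; stops when no player can strictly improve."""
    joint = list(start)
    for _ in range(max_rounds):
        changed = False
        for p in range(len(joint)):
            candidate = list(joint)
            candidate[p] = best_response(p, joint, actions, preferred)
            # Switch only on strict improvement to avoid cycling between ties.
            if lex_cost(p, candidate, preferred) < lex_cost(p, joint, preferred):
                joint = candidate
                changed = True
        if not changed:
            return joint, True   # fixed point: a pure-strategy NE of the toy game
    return joint, False          # round budget exhausted without convergence


# Two cars start in lane 0 and both prefer lane 1.
profile, converged = iterated_best_response([0, 0], [0, 1, 2], [1, 1])
```

In this toy setting each best response is an exhaustive search over three lanes. In the continuous setting the abstract describes, that step is a non-convex optimization problem, which is the cost the paper's piece-wise linear approximation is meant to reduce.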
Award ID(s):
1918531
PAR ID:
10488133
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
2022 IEEE 61st Conference on Decision and Control (CDC)
ISBN:
978-1-6654-6761-2
Page Range / eLocation ID:
5751 to 5756
Format(s):
Medium: X
Location:
Cancun, Mexico
Sponsoring Org:
National Science Foundation
More Like this
  1. Multi-player games with lexicographic cost functions can capture a variety of driving and racing scenarios and under certain conditions are known to have pure-strategy Nash Equilibria. The standard Iterated Best Response (IBR) procedure for finding such equilibria can be slow because, in general, computing the best response for each agent involves solving a non-convex optimization problem. In this paper, we introduce a type of game which uses a lexicographic cost function. We show that for this class of games, the best responses can be effectively computed through piece-wise linear approximations. This in turn enables us to approximate the Nash Equilibria using a linearized version of IBR. We show that the gap between the linear approximations returned by our linearized IBR and the true best response drops asymptotically. We have implemented the algorithm and our experiments show that it can find approximate Nash Equilibria for a handful of agents driving in realistic scenarios in less than 10 seconds.
  2. We report on the formalization in Ssreflect/Coq of a number of concepts and results from algorithmic game theory, including potential games, smooth games, solution concepts such as Pure and Mixed Nash Equilibria, Coarse Correlated Equilibria, epsilon-approximate equilibria, and behavioral models of games such as best-response dynamics. We apply the formalization to prove Price of Stability bounds for, and convergence under best-response dynamics of, the Atomic Routing game, which has applications in computer networking. Our second application proves that Affine Congestion games are (5/3, 1/3)-smooth, and therefore have Price of Anarchy 5/2. Our formalization is available online. 
  3. This paper investigates when one can efficiently recover an approximate Nash Equilibrium (NE) in offline congestion games. The existing dataset coverage assumption in offline general-sum games inevitably incurs a dependency on the number of actions, which can be exponentially large in congestion games. We consider three different types of feedback with decreasing revealed information. Starting from the facility-level (a.k.a., semi-bandit) feedback, we propose a novel one-unit deviation coverage condition and show a pessimism-type algorithm that can recover an approximate NE. For the agent-level (a.k.a., bandit) feedback setting, interestingly, we show the one-unit deviation coverage condition is not sufficient. On the other hand, we convert the game to multi-agent linear bandits and show that with a generalized data coverage assumption in offline linear bandits, we can efficiently recover the approximate NE. Lastly, we consider a novel type of feedback, the game-level feedback where only the total reward from all agents is revealed. Again, we show the coverage assumption for the agent-level feedback setting is insufficient in the game-level feedback setting, and with a stronger version of the data coverage assumption for linear bandits, we can recover an approximate NE. Together, our results constitute the first study of offline congestion games and imply formal separations between different types of feedback. 
  4. Evolutionary anti-coordination games on networks capture real-world strategic situations such as traffic routing and market competition. Two key problems concerning evolutionary games are the existence of a pure Nash equilibrium (NE) and the convergence time. In this work, we study these two problems for anti-coordination games under sequential and synchronous update schemes. For each update scheme, we examine two decision modes based on whether an agent considers its own previous action (self essential) or not (self non-essential) in choosing its next action. Using a relationship between games and dynamical systems, we show that for both update schemes, finding an NE can be done efficiently under the self non-essential mode but is computationally intractable under the self essential mode. We then identify special cases for which an NE can be obtained efficiently. For convergence time, we show that the dynamics converges in a polynomial number of steps under the synchronous scheme; for the sequential scheme, the convergence time is polynomial only under the self non-essential mode. Through experiments, we empirically examine the convergence time and the equilibria for both synthetic and real-world networks.
  5. Policy gradient methods enjoy strong practical performance in numerous tasks in reinforcement learning. Their theoretical understanding in multiagent settings, however, remains limited, especially beyond two-player competitive and potential Markov games. In this paper, we develop a new framework to characterize optimistic policy gradient methods in multi-player Markov games with a single controller. Specifically, under the further assumption that the game exhibits an equilibrium collapse, in that the marginals of coarse correlated equilibria (CCE) induce Nash equilibria (NE), we show convergence to stationary ϵ-NE in O(1/ϵ²) iterations, where O(⋅) suppresses polynomial factors in the natural parameters of the game. Such an equilibrium collapse is well-known to manifest itself in two-player zero-sum Markov games, but also occurs even in a class of multi-player Markov games with separable interactions, as established by recent work. As a result, we bypass known complexity barriers for computing stationary NE when either of our assumptions fails. Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.