Title: Cost Inference for Feedback Dynamic Games from Noisy Partial State Observations and Incomplete Trajectories
In multi-agent dynamic games, the Nash equilibrium state trajectory of each agent is determined by its cost function and the information pattern of the game. However, the cost and trajectory of each agent may be unavailable to the other agents. Prior work on using partial observations to infer the costs in dynamic games assumes an open-loop information pattern. In this work, we demonstrate that the feedback Nash equilibrium concept is more expressive and encodes more complex behavior. It is desirable to develop specific tools for inferring players’ objectives in feedback games. Therefore, we consider the dynamic game cost inference problem under the feedback information pattern, using only partial state observations and incomplete trajectory data. To this end, we first propose an inverse feedback game loss function, whose minimizer yields a feedback Nash equilibrium state trajectory closest to the observation data. We characterize the landscape and differentiability of the loss function. Given the difficulty of obtaining the exact gradient, our main contribution is an efficient gradient approximator, which enables a novel inverse feedback game solver that minimizes the loss using first-order optimization. In thorough empirical evaluations, we demonstrate that our algorithm converges reliably and has better robustness and generalization performance than the open-loop baseline method when the observation data reflects a group of players acting in a feedback Nash game.
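As a rough illustration of the inverse-game idea described in the abstract, the sketch below fits one unknown cost weight in a toy two-player scalar linear-quadratic game by minimizing the squared distance between the feedback Nash state trajectory and noisy observations at a subset of time steps. Everything here is an assumption made for illustration: the game, the constants, and the names `feedback_nash_rollout`, `loss`, `fd_gradient`, and `infer_cost` are hypothetical and not taken from the paper. In particular, the gradient is a plain central finite difference, which stands in for (and is not) the paper's gradient approximator, and because the state is scalar, only the incomplete-trajectory aspect is represented, not partial state observation.

```python
import numpy as np

# Hypothetical toy game (not from the paper): x_{t+1} = A x_t + B[0] u1 + B[1] u2,
# player i pays sum_t (q_i x_t^2 + R[i] u_i^2) plus terminal cost q_i x_T^2.
A, B, R = 1.0, np.array([0.5, 0.5]), np.array([1.0, 1.0])
Q2, T, X0 = 1.0, 10, 1.0
OBS_IDX = np.array([1, 3, 4, 7, 10])  # incomplete trajectory: only these steps are observed


def feedback_nash_rollout(q1):
    """State trajectory x_0..x_T under the feedback Nash equilibrium for weight q1."""
    q = np.array([q1, Q2], dtype=float)
    p = q.copy()  # value-function weights, V_i(x) = p_i x^2, initialized at the terminal stage
    gains = np.zeros((T, 2))
    for t in reversed(range(T)):
        # Coupled first-order conditions for the linear feedback gains u_i = -k_i x:
        # (R_i + p_i B_i^2) k_i + p_i B_i B_j k_j = p_i B_i A
        M = np.diag(R) + np.outer(p * B, B)
        k = np.linalg.solve(M, p * B * A)
        a_cl = A - B @ k                  # closed-loop dynamics coefficient
        p = q + R * k**2 + p * a_cl**2    # backward value recursion
        gains[t] = k
    x = np.empty(T + 1)
    x[0] = X0
    for t in range(T):
        x[t + 1] = (A - B @ gains[t]) * x[t]
    return x


def loss(q1, obs_idx, y):
    """Squared distance between the equilibrium trajectory and the observed samples."""
    return float(np.sum((feedback_nash_rollout(q1)[obs_idx] - y) ** 2))


def fd_gradient(q1, obs_idx, y, h=1e-5):
    """Central finite-difference gradient (a stand-in, not the paper's approximator)."""
    return (loss(q1 + h, obs_idx, y) - loss(q1 - h, obs_idx, y)) / (2.0 * h)


def infer_cost(q1_init, obs_idx, y, iters=50, step=1.0):
    """First-order descent with backtracking; q1 is kept positive by clipping."""
    q1 = q1_init
    for _ in range(iters):
        g = fd_gradient(q1, obs_idx, y)
        s, improved = step, False
        while s > 1e-8:
            cand = max(q1 - s * g, 1e-3)
            if loss(cand, obs_idx, y) < loss(q1, obs_idx, y):
                q1, improved = cand, True
                break
            s *= 0.5
        if not improved:
            break  # no descent step found; stop
    return q1


# Synthetic data: true weight 2.0, small Gaussian noise on the observed steps.
rng = np.random.default_rng(0)
y_obs = feedback_nash_rollout(2.0)[OBS_IDX] + 0.01 * rng.standard_normal(OBS_IDX.size)
q1_hat = infer_cost(0.5, OBS_IDX, y_obs)
print(f"estimated q1 = {q1_hat:.3f}, loss = {loss(q1_hat, OBS_IDX, y_obs):.2e}")
```

The backtracking step is a design choice for this sketch only: it guarantees that the loss never increases between iterations, which keeps the toy example stable without tuning a step size.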
Award ID(s):
2211548
PAR ID:
10511440
Author(s) / Creator(s):
Publisher / Repository:
International Foundation for Autonomous Agents and Multiagent Systems
Date Published:
Journal Name:
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
ISBN:
9781450394321
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Markov games model interactions among multiple players in a stochastic, dynamic environment. Each player in a Markov game maximizes its expected total discounted reward, which depends upon the policies of the other players. We formulate a class of Markov games, termed affine Markov games, where an affine reward function couples the players’ actions. We introduce a novel solution concept, the soft-Bellman equilibrium, where each player is boundedly rational and chooses a soft-Bellman policy rather than a purely rational policy as in the well-known Nash equilibrium concept. We provide conditions for the existence and uniqueness of the soft-Bellman equilibrium and propose a nonlinear least-squares algorithm to compute such an equilibrium in the forward problem. We then solve the inverse game problem of inferring the players’ reward parameters from observed state-action trajectories via a projected-gradient algorithm. Experiments in a predator-prey OpenAI Gym environment show that the reward parameters inferred by the proposed algorithm outperform those inferred by a baseline algorithm: they reduce the Kullback-Leibler divergence between the equilibrium policies and observed policies by at least two orders of magnitude.
  2. In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic cost are studied under an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers whose utilities sum to zero compete to influence a large population of indistinguishable agents. In particular, the case in which the transition and utility functions depend on the state, the action of the controllers, and the mean of the state and the actions is investigated. The optimality conditions of the game are analysed for both open-loop and closed-loop controls, and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy optimization methods that rely on policy gradient are proposed for both model-based and sample-based frameworks. In the model-based case, the gradients are computed exactly using the model, whereas they are estimated using Monte-Carlo simulations in the sample-based case. Numerical experiments are conducted to show the convergence of the utility function as well as the two players' controls.
  3. We develop a probabilistic approach to continuous-time finite state mean field games. Based on an alternative description of continuous-time Markov chains by means of semimartingales and the weak formulation of stochastic optimal control, our approach not only allows us to tackle the mean field of states and the mean field of control at the same time, but also extends the strategy set of players from Markov strategies to closed-loop strategies. We show the existence and uniqueness of the Nash equilibrium for the mean field game, and show how this equilibrium constitutes an approximate Nash equilibrium for the game with a finite number of players, under different assumptions on the structure and regularity of the cost functions and the transition rates between states.
  4. An atomic routing game is a multiplayer game on a directed graph. Each player in the game chooses a path—a sequence of links that connect its origin node to its destination node—with the lowest cost, where the cost of each link is a function of all players’ choices. We develop a novel numerical method to design the link cost function in atomic routing games such that the players’ choices at the Nash equilibrium minimize a given smooth performance function. This method first approximates the nonsmooth Nash equilibrium conditions with smooth ones, then iteratively improves the link cost function via implicit differentiation. We demonstrate the application of this method to atomic routing games that model noncooperative agents navigating in grid worlds. 
  5. We develop a probabilistic approach to continuous-time finite state mean field games. Based on an alternative description of continuous-time Markov chains by means of semimartingales and the weak formulation of stochastic optimal control, our approach not only allows us to tackle the mean field of states and the mean field of control at the same time, but also extends the strategy set of players from Markov strategies to closed-loop strategies. We show the existence and uniqueness of Nash equilibrium for the mean field game as well as how the equilibrium of a mean field game consists of an approximative Nash equilibrium for the game with a finite number of players under different assumptions of structure and regularity on the cost functions and transition rate between states.