Title: Bilevel Entropy Based Mechanism Design for Balancing Meta in Video Games
We address a mechanism design problem where the goal of the designer is to maximize the entropy of a player's mixed strategy at a Nash equilibrium. This objective is of special relevance to video games where game designers wish to diversify the players' interaction with the game. To solve this design problem, we propose a bi-level alternating optimization technique that (1) approximates the mixed strategy Nash equilibrium using a Nash Monte-Carlo reinforcement learning approach and (2) applies a gradient-free optimization technique (Covariance-Matrix Adaptation Evolutionary Strategy) to maximize the entropy of the mixed strategy obtained in level (1). The experimental results show that our approach achieves comparable results to the state-of-the-art approach on three benchmark domains "Rock-Paper-Scissors-Fire-Water", "Workshop Warfare" and "Pokemon Video Game Championship". Next, we show that, unlike previous state-of-the-art approaches, the computational complexity of our proposed approach scales significantly better in larger combinatorial strategy spaces.
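The two-level structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the inner level here uses plain fictitious play in place of the Nash Monte-Carlo reinforcement learning approach, the outer level uses a simple (1+lambda) evolution strategy in place of CMA-ES, and the three-parameter weighted Rock-Paper-Scissors game, the function names, and all constants are hypothetical choices made only for illustration.

```python
import math
import random


def payoff_matrix(theta):
    """Antisymmetric payoffs for a weighted Rock-Paper-Scissors game.

    theta = (a, b, c): paper beats rock for a, rock beats scissors for b,
    scissors beats paper for c. This parameterization is an assumption
    made for this sketch, not a benchmark from the paper.
    """
    a, b, c = theta
    return [[0.0, -a, b],
            [a, 0.0, -c],
            [-b, c, 0.0]]


def approx_nash(matrix, iterations=5000):
    """Inner level (stand-in): fictitious play in self-play.

    Returns the empirical action frequencies, which approximate a
    mixed-strategy Nash equilibrium of a symmetric zero-sum game.
    """
    n = len(matrix)
    counts = [1.0] * n  # uniform prior over past actions
    for _ in range(iterations):
        # Cumulative payoff of each action against the play history.
        values = [sum(matrix[i][j] * counts[j] for j in range(n))
                  for i in range(n)]
        best = max(range(n), key=values.__getitem__)
        counts[best] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def design_objective(theta):
    """Entropy of the approximate equilibrium induced by theta."""
    return entropy(approx_nash(payoff_matrix(theta)))


def maximize_entropy(theta0, generations=8, offspring=6, sigma=0.5, seed=0):
    """Outer level (stand-in): a (1+lambda) evolution strategy.

    Perturbs the game parameters with Gaussian noise, clamps them to
    [0.1, 5.0], and keeps a candidate only if it increases the entropy.
    """
    rng = random.Random(seed)
    best_theta, best_value = tuple(theta0), design_objective(theta0)
    for _ in range(generations):
        for _ in range(offspring):
            candidate = tuple(min(5.0, max(0.1, t + rng.gauss(0.0, sigma)))
                              for t in best_theta)
            value = design_objective(candidate)
            if value > best_value:
                best_theta, best_value = candidate, value
        sigma *= 0.8  # shrink the search radius each generation
    return best_theta, best_value


if __name__ == "__main__":
    theta, value = maximize_entropy((1.0, 2.0, 3.0))
    print("parameters:", theta)
    print("entropy:", value, "upper bound:", math.log(3))
```

In this toy game the exact interior equilibrium is proportional to (c, b, a), so the entropy is largest when the three payoffs are equal; the outer search should therefore drift toward balanced parameters. A gradient-free outer level is a natural fit here because the equilibrium returned by the inner level is not a differentiable function of the game parameters.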
Award ID(s):
2238979
PAR ID:
10498042
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
International Foundation for Autonomous Agents and Multiagent Systems
Date Published:
Journal Name:
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
ISBN:
9781450394321
Format(s):
Medium: X
Location:
London, United Kingdom
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we develop an optimal weight adaptation strategy of model predictive control (MPC) for connected and automated vehicles (CAVs) in mixed traffic. We model the interaction between a CAV and a human-driven vehicle (HDV) as a simultaneous game and formulate a game-theoretic MPC problem to find a Nash equilibrium of the game. In the MPC problem, the weights in the HDV’s objective function can be learned online using moving horizon inverse reinforcement learning. Using Bayesian optimization, we propose a strategy to optimally adapt the weights in the CAV’s objective function so that the expected true cost when using MPC in simulations can be minimized. We validate the effectiveness of the optimal strategy by numerical simulations of a vehicle crossing example at an unsignalized intersection.
  2. We develop a probabilistic approach to continuous-time finite state mean field games. Based on an alternative description of continuous-time Markov chains by means of semimartingales and the weak formulation of stochastic optimal control, our approach not only allows us to tackle the mean field of states and the mean field of control at the same time, but also extends the strategy set of players from Markov strategies to closed-loop strategies. We show the existence and uniqueness of Nash equilibrium for the mean field game, as well as how the equilibrium of a mean field game consists of an approximative Nash equilibrium for the game with a finite number of players under different assumptions of structure and regularity on the cost functions and transition rate between states.
  3. We develop a probabilistic approach to continuous-time finite state mean field games. Based on an alternative description of continuous-time Markov chains by means of semimartingales and the weak formulation of stochastic optimal control, our approach not only allows us to tackle the mean field of states and the mean field of control at the same time, but also extends the strategy set of players from Markov strategies to closed-loop strategies. We show the existence and uniqueness of Nash equilibrium for the mean field game as well as how the equilibrium of a mean field game consists of an approximative Nash equilibrium for the game with a finite number of players under different assumptions of structure and regularity on the cost functions and transition rate between states.
  4. This paper introduces a game-theoretical strategy for optimal dispatch of building thermal loads, based on a marginal price model derived from an actual dispatch curve. A non-cooperative game is formulated, and the existence and uniqueness of the Nash equilibrium solution are proved aided by the variational inequality theory. A game solution algorithm is presented in this paper to solve the control problem with guaranteed convergence. The proposed game-theoretical control technique was evaluated against a baseline energy minimization strategy and a socially optimal solution, through a simulation test of a virtual market comprised of six buildings. The results show that the proposed game-theoretical strategy could achieve performance very close to the social optimum with a Price of Anarchy of 1.0041 and a 24% cost reduction compared to the baseline energy-priority strategy. 
  5. Global product platforms can reduce production costs through economies of scale and learning but may decrease revenues by restricting the ability to customize for each market. We model the global platforming problem as a Nash equilibrium among oligopolistic competing firms, each maximizing its profit across markets with respect to its pricing, design, and platforming decisions. We develop and compare two methods to identify Nash equilibria: (1) a sequential iterative optimization (SIO) algorithm, in which each firm solves a mixed-integer nonlinear programming problem globally, with firms iterating until convergence; and (2) a mathematical program with equilibrium constraints (MPEC) that solves the Karush-Kuhn-Tucker conditions for all firms simultaneously. The algorithms’ performance and results are compared in a case study of plug-in hybrid electric vehicles (PHEVs) where firms choose optimal battery capacity and whether to platform or differentiate battery capacity across the US and Chinese markets. We examine a variety of scenarios for (1) learning rate and (2) consumer willingness to pay (WTP) for range in each market. For the case of two firms, both approaches find the Nash equilibrium in all scenarios. On average, the SIO approach solves 200 times faster than the MPEC approach, and the MPEC approach is more sensitive to the starting point. Results show that the optimum for each firm is to platform when learning rates are high or the difference between consumer WTP for range in each market is relatively small. Otherwise, the PHEVs are differentiated, with low range for China and high range for the US.