NSF PAR Search | NSF Public Access Repository


State-Constrained Zero-Sum Differential Games with One-Sided Information

Ghimire, Mukesh; Zhang, Lei; Xu, Zhe; Ren, Yi (July 2024, International Conference on Machine Learning)

We study zero-sum differential games with state constraints and one-sided information, where the informed player (Player 1) has a categorical payoff type unknown to the uninformed player (Player 2). Player 1 aims to minimize his payoff without violating the constraints, while Player 2 aims either to have the state constraints violated or, otherwise, to maximize the payoff. One example of such a game is a man-to-man matchup in football. Without state constraints, Cardaliaguet (2007) showed that the value of such a game exists and is convex in the common belief of the players. Our theoretical contribution is an extension of this result to differential games with state constraints and the derivation of the primal and dual subdynamic principles necessary for computing the behavioral strategies. Whereas existing works on imperfect-information dynamic games focus on scalability and generalization, we focus on revealing the mechanism of belief-manipulation behaviors resulting from information asymmetry and state constraints. We use a simplified football game to demonstrate the utility of this work: we reveal the player positions and belief states in which the attacker should (or should not) play specific random fake moves to take advantage of information asymmetry, and we compute how the defender should respond.
Full Text Available
Pontryagin neural operator for solving general-sum differential games with parametric state constraints

Zhang, Lei; Ghimire, Mukesh; Xu, Zhe; Zhang, Wenlong; Ren, Yi (July 2024, 6th Annual Learning for Dynamics & Control Conference)

The values of two-player general-sum differential games are viscosity solutions to Hamilton-Jacobi-Isaacs (HJI) equations. Value and policy approximations for such games suffer from the curse of dimensionality (CoD). Alleviating CoD through physics-informed neural networks (PINN) encounters convergence issues when value discontinuity is present due to state constraints. On top of these challenges, it is often necessary to learn generalizable values and policies across a parametric space of games, e.g., for game parameter inference when information is incomplete. To address these challenges, we propose a Pontryagin-mode neural operator that outperforms the existing state of the art (SOTA) on safety performance across games with parametric state constraints. Our key contribution is the introduction of a costate loss defined on the discrepancy between forward and backward costate rollouts, which are computationally cheap. We show that the discontinuity of costate dynamics (in the presence of state constraints) effectively enables the learning of discontinuous values, without requiring manually supervised data as suggested by the current SOTA. More importantly, we show that the close relationship between costates and policies makes the former critical in learning feedback control policies with generalizable safety performance.
Full Text Available
Solving Two-Player General-Sum Game Between Swarms

https://doi.org/10.23919/ACC60939.2024.10644320

Ghimire, Mukesh; Zhang, Lei; Zhang, Wenlong; Ren, Yi; Xu, Zhe (July 2024, IEEE)

Full Text Available
Value Approximation for Two-Player General-Sum Differential Games With State Constraints

https://doi.org/10.1109/TRO.2024.3411850

Zhang, Lei; Ghimire, Mukesh; Zhang, Wenlong; Xu, Zhe; Ren, Yi (January 2024, IEEE Transactions on Robotics)

Full Text Available
Approximating Discontinuous Nash Equilibrial Values of Two-Player General-Sum Differential Games

https://doi.org/10.1109/ICRA48891.2023.10160219

Zhang, Lei; Ghimire, Mukesh; Zhang, Wenlong; Xu, Zhe; Ren, Yi (May 2023, 2023 IEEE International Conference on Robotics and Automation (ICRA))

Finding Nash equilibrial policies for two-player differential games requires solving Hamilton-Jacobi-Isaacs (HJI) PDEs. Self-supervised learning has been used to approximate solutions of such PDEs while circumventing the curse of dimensionality. However, this method fails to learn discontinuous PDE solutions due to its sampling nature, leading to poor safety performance of the resulting controllers in robotics applications when player rewards are discontinuous. This paper investigates two potential solutions to this problem: a hybrid method that leverages both supervised Nash equilibria and the HJI PDE, and a value-hardening method where a sequence of HJIs is solved with a gradually hardening reward. We compare these solutions using the resulting generalization and safety performance in two vehicle interaction simulation studies with 5D and 9D state spaces, respectively. Results show that with informative supervision (e.g., collision and near-collision demonstrations) and the low cost of self-supervised learning, the hybrid method achieves better safety performance than the supervised, self-supervised, and value-hardening approaches under an equal computational budget. Value hardening fails to generalize in the higher-dimensional case without informative supervision. Lastly, we show that the neural activation function needs to be continuously differentiable for learning PDEs and that its choice can be case dependent.
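As a rough illustration of what a "gradually hardening reward" could look like, the sketch below replaces a discontinuous collision indicator with a logistic curve whose sharpness grows from stage to stage. The logistic form, the geometric schedule, and the names `softened_penalty` and `hardening_schedule` are assumptions made for illustration; the abstract does not state the reward form the authors use, and no HJI solve is performed here.

```python
import math


def softened_penalty(distance, threshold, sharpness):
    """Smooth stand-in for the discontinuous indicator 1[distance < threshold].

    Assumed form (not taken from the paper): a logistic curve in the margin.
    """
    z = sharpness * (threshold - distance)
    # Numerically stable logistic, avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hardening_schedule(start=1.0, end=100.0, stages=5):
    """Geometrically increasing sharpness values, one per solve stage."""
    if stages < 2:
        return [end]
    ratio = (end / start) ** (1.0 / (stages - 1))
    return [start * ratio ** k for k in range(stages)]


if __name__ == "__main__":
    for sharpness in hardening_schedule():
        # A real pipeline would run an HJI solve here, warm-started from the
        # previous stage; this sketch only prints the shape of the reward.
        values = [softened_penalty(d, 1.0, sharpness) for d in (0.5, 1.0, 1.5)]
        print(round(sharpness, 2), [round(v, 4) for v in values])
```

As the sharpness increases, the penalty approaches 1 inside the threshold and 0 outside it, so each stage presents a reward that is closer to the discontinuous one.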
Full Text Available
When Shall I Estimate Your Intent? Costs and Benefits of Intent Inference in Multi-Agent Interactions

https://doi.org/10.23919/ACC53348.2022.9867155

Amatya, Sunny; Ghimire, Mukesh; Ren, Yi; Xu, Zhe; Zhang, Wenlong (June 2022, 2022 American Control Conference (ACC))

This paper addresses incomplete-information dynamic games, where reward parameters of agents are private. Previous studies have shown that online belief updates are necessary for deriving equilibrial policies of such games, especially for high-risk games such as vehicle interactions. However, updating beliefs in real time is computationally expensive, as it requires continuous computation of Nash equilibria of the sub-games starting from the current states. In this paper, we consider the triggering mechanism of belief update as a policy defined on the agents’ physical and belief states, and propose learning this policy through reinforcement learning (RL). Using a two-vehicle uncontrolled intersection case, we show that intermittent belief update via RL is sufficient for safe interactions, reducing the computation cost of updates by 59% when agents have full observations of physical states. Simulation results also show that the belief update frequency increases as noise becomes more significant in measurements of the vehicle positions.
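The idea of intermittent belief updating can be sketched with a toy example: a Bayes update over a discrete set of reward types is applied only at steps where a trigger fires. Everything here is hypothetical. The function names, the precomputed likelihoods, and the hand-written trigger are illustrative stand-ins; in the paper the trigger is a policy learned through RL, and each update involves computing Nash equilibria, which this sketch does not model.

```python
def bayes_update(belief, likelihoods):
    """Posterior over discrete types given per-type observation likelihoods."""
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Observation is impossible under every type; keep the prior.
        return list(belief)
    return [u / total for u in unnormalized]


def run_intermittent(belief, likelihood_stream, should_update):
    """Apply Bayes updates only at steps where the trigger fires.

    `should_update(step, belief)` is a placeholder for a learned trigger.
    Returns the final belief and the number of updates performed.
    """
    updates = 0
    for step, likelihoods in enumerate(likelihood_stream):
        if should_update(step, belief):
            belief = bayes_update(belief, likelihoods)
            updates += 1
    return belief, updates


if __name__ == "__main__":
    stream = [[0.8, 0.2]] * 4  # assumed likelihoods for two reward types
    every_other_step = lambda step, belief: step % 2 == 0
    print(run_intermittent([0.5, 0.5], stream, every_other_step))
```

Counting the updates makes the trade-off visible: fewer triggers mean less computation but a belief that lags behind the observations.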
Full Text Available
