In this paper, pursuit-evasion scenarios in a stochastic flow field involving one pursuer and one evader are analyzed. Using a forward reachability set-based approach and the associated level set equations, nominal solutions of the players are generated. The dynamical system is linearized along the nominal solution to formulate a chance-constrained, linear-quadratic stochastic dynamic game. Assuming an affine disturbance feedback structure, the proposed game is solved using the standard Gauss-Seidel iterative scheme. Numerical simulations demonstrate the proposed approach for realistic flow fields.
Bounded-Rational Pursuit-Evasion Games
We present a framework that incorporates the principle of bounded rationality into dynamic stochastic pursuit-evasion games. The solution of a stochastic game is generally characterized by its (Nash) equilibria in feedback form, whose calculation may require extensive computational resources. In this paper, the agents are modeled as bounded rational entities with limited computational capabilities. We illustrate the proposed framework by applying it to a pursuit-evasion game between two aerial vehicles in a stochastic wind field. We show how such a game may be discretized and properly analyzed by casting it as an iterative sequence of finite-state Markov Decision Processes (MDPs). Leveraging tools and algorithms from cognitive hierarchy theory (“level-k thinking”), we compute the solution of the ensuing discrete game while taking into consideration the rationality level of each agent. We also present an online algorithm for each agent to infer its opponent's rationality level.
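The level-k construction described in the abstract can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: it assumes a hypothetical one-dimensional grid of `N` cells, an assumed wind distribution `WIND`, a unit per-step reward until capture, and a level-0 agent that acts uniformly at random. Each level-k policy is obtained by value iteration on the MDP that results from fixing the opponent at level k-1. The function `update_level_belief` is a generic Bayesian update over candidate levels, used here only as a stand-in for the online inference algorithm, whose details the abstract does not give.

```python
import itertools
import numpy as np

N = 5                              # cells on a line (hypothetical discretization)
ACTIONS = (-1, 0, 1)               # move left, stay, move right
WIND = {-1: 0.1, 0: 0.8, 1: 0.1}   # assumed wind drift, applied to each agent independently


def clip(x):
    """Keep a position inside the grid."""
    return min(max(x, 0), N - 1)


def transitions(p, e, a_p, a_e):
    """Yield (probability, next pursuer cell, next evader cell)."""
    for w_p, pr_p in WIND.items():
        for w_e, pr_e in WIND.items():
            yield pr_p * pr_e, clip(p + a_p + w_p), clip(e + a_e + w_e)


def best_response(opp_policy, role, gamma=0.9, iters=100):
    """Value iteration on the MDP obtained by fixing the opponent's policy."""
    step_reward = -1.0 if role == "pursuer" else 1.0  # pursuer wants early capture
    V = np.zeros((N, N))
    Q = np.zeros((N, N, len(ACTIONS)))
    for _ in range(iters):
        for p, e in itertools.product(range(N), repeat=2):
            if p == e:
                continue  # capture is terminal, so its value stays 0
            for i, own in enumerate(ACTIONS):
                q = 0.0
                for j, opp in enumerate(ACTIONS):
                    a_p, a_e = (own, opp) if role == "pursuer" else (opp, own)
                    for pr, p2, e2 in transitions(p, e, a_p, a_e):
                        q += opp_policy[p, e, j] * pr * (step_reward + gamma * V[p2, e2])
                Q[p, e, i] = q
        V = Q.max(axis=2)
    # One-hot greedy policy with shape (N, N, number of actions).
    return np.eye(len(ACTIONS))[Q.argmax(axis=2)]


def level_k_policies(max_level):
    """Level 0 is uniform random; level k best-responds to the opponent's level k-1."""
    uniform = np.full((N, N, len(ACTIONS)), 1.0 / len(ACTIONS))
    pursuer, evader = [uniform], [uniform]
    for k in range(1, max_level + 1):
        pursuer.append(best_response(evader[k - 1], "pursuer"))
        evader.append(best_response(pursuer[k - 1], "evader"))
    return pursuer, evader


def update_level_belief(belief, policies, state, action_idx, eps=0.05):
    """Bayes update over opponent levels; eps smooths deterministic policies."""
    lik = np.array([(1 - eps) * pol[state][action_idx] + eps / len(ACTIONS)
                    for pol in policies])
    post = belief * lik
    return post / post.sum()


pursuer, evader = level_k_policies(2)
# The evader watches the pursuer move right from state (pursuer=0, evader=3).
belief = update_level_belief(np.array([0.5, 0.5]), pursuer[:2], (0, 3), 2)
print(belief)
```

In this sketch, an observed move toward the evader shifts the belief toward the level-1 pursuer, because a level-0 pursuer would choose that move only a third of the time.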
 Award ID(s):
 1849130
 Publication Date:
 NSF-PAR ID:
 10315945
 Journal Name:
 American Control Conference
 Sponsoring Org:
 National Science Foundation
More Like this


We consider the high-dimensional linear regression problem, where the algorithmic goal is to efficiently infer an unknown feature vector $\beta^*\in\mathbb{R}^p$ from its linear measurements, using a small number $n$ of samples. Unlike most of the literature, we make no sparsity assumption on $\beta^*$, but instead adopt a different regularization: in the noiseless setting, we assume that the entries of $\beta^*$ are either rational numbers with a common denominator $Q\in\mathbb{Z}^+$ (referred to as $Q$-rationality) or irrational numbers taking values in a rationally independent set of bounded cardinality known to the learner; together, these conditions are called the mixed-range assumption. Using a novel combination of the Partial Sum of Least Squares (PSLQ) integer relation detection and the Lenstra-Lenstra-Lov\'asz (LLL) lattice basis reduction algorithms, we propose a polynomial-time algorithm which provably recovers a $\beta^*\in\mathbb{R}^p$ enjoying the mixed-range assumption from its linear measurements $Y=X\beta^*\in\mathbb{R}^n$ for a large class of distributions for the random entries of $X$, even with one measurement ($n=1$). In the noisy setting, we propose a polynomial-time, lattice-based algorithm which recovers a $\beta^*\in\mathbb{R}^p$ enjoying the $Q$-rationality property from its noisy measurements $Y=X\beta^*+W\in\mathbb{R}^n$, even from a single sample ($n=1$). We further establish that for large $Q$ and normal noise, this algorithm tolerates an information-theoretically optimal level of …

The human ability to deceive others and detect deception has long been tied to theory of mind. We make a stronger argument: in order to be adept liars – to balance gain (i.e. maximizing their own reward) and plausibility (i.e. maintaining a realistic lie) – humans calibrate their lies under the assumption that their partner is a rational, utility-maximizing agent. We develop an adversarial recursive Bayesian model that aims to formalize the behaviors of liars and lie detectors. We compare this model to (1) a model that does not perform theory of mind computations and (2) a model that has perfect knowledge of the opponent’s behavior. To test these models, we introduce a novel dyadic, stochastic game, allowing for quantitative measures of lies and lie detection. In a second experiment, we vary the ground truth probability. We find that our rational models qualitatively predict human lying and lie detecting behavior better than the non-rational model. Our findings suggest that humans control for the extremeness of their lies in a manner reflective of rational social inference. These findings provide a new paradigm and formal framework for nuanced quantitative analysis of the role of rationality and theory of mind in lying and …

In computational approaches to bounded rationality, metareasoning enables intelligent agents to optimize their own decision-making process in order to produce effective action in a timely manner. While there have been substantial efforts to develop effective meta-level control for anytime algorithms, existing techniques rely on extensive offline work, imposing several critical assumptions that diminish their effectiveness and limit their practical utility in the real world. In order to eliminate these assumptions, adaptive metareasoning enables intelligent agents to adapt to each individual instance of the problem at hand without the need for significant offline preprocessing. Building on our recent work, we first introduce a model-free approach to meta-level control based on reinforcement learning. We then present a meta-level control technique that uses temporal difference learning. Finally, we show empirically that our approach is effective on a common benchmark in meta-level control.
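To make the idea of model-free meta-level control concrete, here is a minimal sketch that uses tabular Q-learning (one temporal difference method) to learn when to stop an anytime algorithm. It is not the technique from the paper: the performance profile `1 - 0.5**t`, the linear time cost, the horizon, and all hyperparameters are assumptions chosen so that the best stopping time is easy to verify by hand.

```python
import random


def utility(t):
    """Assumed time-dependent utility: solution quality minus a linear time cost."""
    quality = 1.0 - 0.5 ** t
    return quality - 0.1 * t


def learn_stopping_policy(horizon=6, episodes=5000, alpha=0.5, epsilon=0.3, seed=0):
    """Tabular Q-learning over states t = 0..horizon with actions continue/stop."""
    rng = random.Random(seed)
    CONTINUE, STOP = 0, 1
    q = [[0.0, 0.0] for _ in range(horizon + 1)]
    for _ in range(episodes):
        t = 0
        while True:
            if rng.random() < epsilon:           # explore
                a = rng.randrange(2)
            else:                                # exploit, ties go to CONTINUE
                a = CONTINUE if q[t][CONTINUE] >= q[t][STOP] else STOP
            if a == STOP or t == horizon:
                # Terminal step: the reward is the utility of the current solution.
                q[t][a] += alpha * (utility(t) - q[t][a])
                break
            # Non-terminal step: zero reward, bootstrap from the next state.
            target = max(q[t + 1])
            q[t][a] += alpha * (target - q[t][a])
            t += 1
    return q


def greedy_stop_time(q):
    """Follow the greedy policy and return the step at which it stops."""
    t = 0
    while t < len(q) - 1 and q[t][0] >= q[t][1]:
        t += 1
    return t


q_table = learn_stopping_policy()
print(greedy_stop_time(q_table))
```

With this assumed utility, the values at t = 2, 3, 4 are 0.55, 0.575, and 0.5375, so a well-trained controller should stop at t = 3.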

Driven by recent successes in two-player, zero-sum game solving and playing, artificial intelligence work on games has increasingly focused on algorithms that produce equilibrium-based strategies. However, this approach has been less effective at producing competent players in general-sum games or those with more than two players than in two-player, zero-sum games. An appealing alternative is to consider adaptive algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior. This approach also leads to a game-theoretic analysis, but in the correlated play that arises from joint learning dynamics rather than factored agent behavior at equilibrium. We develop and advocate for this hindsight rationality framing of learning in general sequential decision-making settings. To this end, we re-examine mediated equilibrium and deviation types in extensive-form games, thereby gaining a more complete understanding and resolving past misconceptions. We present a set of examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature, and prove that no tractable concept subsumes all others. This line of inquiry culminates in the definition of the deviation and equilibrium classes that correspond to algorithms in the counterfactual regret minimization (CFR) family, relating them to all others in …
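As a small illustration of hindsight-based learning, the sketch below runs regret matching, the local update rule that CFR applies at each information set, in self-play on a normal-form zero-sum game. It is only a normal-form toy under stated assumptions and does not implement extensive-form CFR or the deviation classes discussed in the abstract. The payoff matrix is hypothetical; its unique equilibrium has both players choosing their first action with probability 0.4.

```python
import numpy as np


def regret_matching_strategy(regret):
    """Play in proportion to positive cumulative regret, uniform if there is none."""
    positive = np.maximum(regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(regret), 1.0 / len(regret))


def self_play(payoff, iterations=50000):
    """Return the average strategies of both players after regret-matching self-play.

    payoff[i, j] is the row player's payoff; the column player receives its negative.
    """
    n_row, n_col = payoff.shape
    regret_row, regret_col = np.zeros(n_row), np.zeros(n_col)
    sum_row, sum_col = np.zeros(n_row), np.zeros(n_col)
    for _ in range(iterations):
        x = regret_matching_strategy(regret_row)
        y = regret_matching_strategy(regret_col)
        u_row = payoff @ y        # expected payoff of each row action against y
        u_col = -(x @ payoff)     # expected payoff of each column action against x
        # Regret of an action: its payoff minus the payoff of the strategy played.
        regret_row += u_row - x @ u_row
        regret_col += u_col - y @ u_col
        sum_row += x
        sum_col += y
    return sum_row / iterations, sum_col / iterations


# Hypothetical 2x2 zero-sum game with equilibrium (0.4, 0.6) for both players.
payoff = np.array([[2.0, -1.0],
                   [-1.0, 1.0]])
avg_row, avg_col = self_play(payoff)
print(avg_row, avg_col)
```

The averaged strategies, rather than the last iterates, are what approach equilibrium in the two-player zero-sum case, which is why the function returns running averages.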