Title: Modeling boundedly rational agents with latent inference budgets
We study the problem of modeling a population of agents pursuing unknown goals subject to unknown computational constraints. In standard models of bounded rationality, sub-optimal decision-making is simulated by adding homoscedastic noise to optimal decisions rather than actually simulating constrained inference. In this work, we introduce a latent inference budget model (L-IBM) that models these constraints explicitly, via a latent variable (inferred jointly with a model of agents’ goals) that controls the runtime of an iterative inference algorithm. L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors. In three modeling tasks (inferring navigation goals from routes, inferring communicative intents from human utterances, and predicting next moves in human chess games) we show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty. Moreover, the inferred inference budgets are themselves meaningful, efficient to compute, and correlated with measures of player skill, partner skill and task difficulty.
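The idea of a latent variable that controls the runtime of an iterative inference algorithm can be illustrated with a toy sketch. This is not the authors' implementation: the chain-shaped navigation task, the choice of truncated value iteration as the iterative algorithm, the uniform prior over budgets, and every name and constant below (`q_values`, `budget_posterior`, `GAMMA`, `BETA`) are assumptions made for illustration only.

```python
import math

N_STATES = 6        # chain of states 0..5; the goal is the last state (assumed toy task)
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right
GAMMA = 0.9         # discount factor (illustrative value)
BETA = 5.0          # softmax sharpness (illustrative value)


def step(state, action):
    """Deterministic move along the chain, clipped at both ends."""
    return min(max(state + action, 0), N_STATES - 1)


def q_values(budget):
    """Run `budget` sweeps of value iteration and return the resulting Q-table."""
    values = [0.0] * N_STATES
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(budget):
        q = [[(1.0 if step(s, a) == N_STATES - 1 else 0.0) + GAMMA * values[step(s, a)]
              for a in ACTIONS] for s in range(N_STATES)]
        values = [max(row) for row in q]
    return q


def action_probs(q_row):
    """Softmax over the Q-values of one state."""
    top = max(q_row)
    weights = [math.exp(BETA * (x - top)) for x in q_row]
    total = sum(weights)
    return [w / total for w in weights]


def budget_posterior(trajectory, max_budget):
    """Posterior over budgets 0..max_budget under a uniform prior.

    `trajectory` is a list of (state, action_index) pairs.
    """
    log_liks = []
    for budget in range(max_budget + 1):
        q = q_values(budget)
        log_liks.append(sum(math.log(action_probs(q[s])[a]) for s, a in trajectory))
    top = max(log_liks)
    weights = [math.exp(ll - top) for ll in log_liks]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # An agent that walks straight from state 0 to the goal.
    direct = [(0, 1), (1, 1), (2, 1), (3, 1), (4, 1)]
    print(budget_posterior(direct, max_budget=8))
```

In this toy setting, an agent with a small budget has not yet propagated the goal reward back to distant states, so its action choice there is uniform. A trajectory that heads directly to the goal from far away is therefore better explained by larger budgets, whereas aimless moves near the start favor smaller ones.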
Award ID(s):
2212310 2238240
PAR ID:
10535717
Author(s) / Creator(s):
; ;
Publisher / Repository:
International Conference on Learning Representations
Date Published:
Format(s):
Medium: X
Location:
International Conference on Learning Representations
Sponsoring Org:
National Science Foundation
More Like this
  1. Akaishi, Rei (Ed.)
    Cognitive rehabilitation, STEM (science, technology, engineering, and math) skill acquisition, and coaching games such as chess often require tutoring decision-making strategies. The advancement of AI-driven tutoring systems for facilitating human learning requires an understanding of the impact of evaluative feedback on human decision-making and skill development. To this end, we conduct human experiments using Amazon Mechanical Turk to study the influence of evaluative feedback on human decision-making in sequential tasks. In these experiments, participants solve the Tower of Hanoi puzzle and receive AI-generated feedback while solving it. We examine how this feedback affects their learning and skill transfer to related tasks. Additionally, treating humans as noisy optimal agents, we employ maximum entropy inverse reinforcement learning to analyze the effect of feedback on the implicit human reward structure that guides their decision making. Lastly, we explore various computational models to understand how people incorporate evaluative feedback into their decision-making processes. Our findings underscore that humans perceive evaluative feedback as indicative of their long-term strategic success, thus aiding in skill acquisition and transfer in sequential decision-making tasks. Moreover, we demonstrate that evaluative feedback fosters a more structured and organized learning experience compared to learning without feedback. Furthermore, our results indicate that providing intermediate goals alone does not significantly enhance human learning outcomes. 
  2. Researchers in human–robot collaboration have extensively studied methods for inferring human intentions and predicting their actions, as this is an important precursor for robots to provide useful assistance. We review contemporary methods for intention inference and human activity prediction. Our survey finds that intentions and goals are often inferred via Bayesian posterior estimation and Markov decision processes that model internal human states as unobserved variables or represent both agents in a shared probabilistic framework. An alternative approach is to use neural networks and other supervised learning approaches to directly map observable outcomes to intentions and to make predictions about future human activity based on past observations. That said, due to the complexity of human intentions, existing work usually reasons about limited domains, makes unrealistic simplifications about intentions, and is mostly constrained to short-term predictions. This state of the art provides opportunity for future research that could include more nuanced models of intents, reason over longer horizons, and account for the human tendency to adapt. 
  3. The detection of the optical transient AT2017gfo proved that binary neutron star mergers are progenitors of kilonovae (KNe). Using a combination of numerical-relativity and radiative-transfer simulations, the community has developed sophisticated models for these transients for a wide portion of the expected parameter space. Using these simulations and surrogate models made from them, it has been possible to perform Bayesian inference of the observed signals to infer properties of the ejected matter. It has been pointed out that combining inclination constraints derived from the KN with gravitational-wave measurements increases the accuracy with which binary parameters can be estimated, in particular breaking the distance-inclination degeneracy from gravitational wave inference. To avoid bias from the unknown ejecta geometry, constraints on the inclination angle for AT2017gfo should be insensitive to the employed models. In this work, we compare different assumptions about the ejecta and radiative reprocesses used by the community and we investigate their impact on the parameter inference. While most inferred parameters agree, we find disagreement between posteriors for the inclination angle for different geometries that have been used in the current literature. According to our study, the inclusion of reprocessing of the photons between different ejecta types improves the modeling fits to AT2017gfo and, in some cases, affects the inferred constraints. Our study motivates the inclusion of large ∼ 1-mag uncertainties in the KN models employed for Bayesian analysis to capture yet unknown systematics, especially when inferring inclination angles, although smaller uncertainties seem appropriate to capture model systematics for other intrinsic parameters. We can use this method to impose soft constraints on the ejecta geometry of the KN AT2017gfo.
  4. This paper studies algorithmic decision-making under strategic human behavior, where a decision maker uses an algorithm to make decisions about human agents, and the latter, with information about the algorithm, may strategically exert effort to improve and receive favorable decisions. Unlike prior works that assume agents benefit from their efforts immediately, we consider realistic scenarios where the impacts of these efforts are persistent and agents benefit from efforts by making improvements gradually. We first develop a dynamic model to characterize persistent improvements and, based on this, construct a Stackelberg game to model the interplay between agents and the decision-maker. We analytically characterize the equilibrium strategies and identify conditions under which agents have incentives to improve. With the dynamics, we then study how the decision-maker can design an optimal policy to incentivize the largest improvements inside the agent population. We also extend the model to settings where 1) agents may be dishonest and game the algorithm into making favorable but erroneous decisions; 2) honest efforts are forgettable and not sufficient to guarantee persistent improvements. With the extended models, we further examine conditions under which agents prefer honest efforts over dishonest behavior and the impacts of forgettable efforts.
  5. This paper considers a quickest detection scheme where the change in an underlying parameter influencing human decisions is to be detected by only observing the human decisions. Stemming from behavioral economics and mathematical psychology, we propose two generative models for the human decision maker. Namely, we consider an anticipatory decision making model and a quantum decision model. From a decision theoretic point of view, anticipatory models are time inconsistent, meaning that Bellman's principle of optimality does not hold. The appropriate formalism is thus the subgame Nash equilibrium. We show that the interaction between anticipatory agents and sequential quickest detection results in an unusual (nonconvex) structure of the quickest change detection policy. In contrast, the quantum decision model, despite its mathematical complexity, results in the typical convex quickest detection policy. The optimal quickest detection policy is shown to perform strictly worse than classical quickest detection for both models, via a Blackwell dominance argument. The model and structural results provided contribute to an understanding of the dynamics of human-sensor interfacing.