Title: Capturing Humans’ Mental Models of AI: An Item Response Theory Approach
Improving our understanding of how humans perceive AI teammates is an important foundation for our general understanding of human-AI teams. Extending relevant work from cognitive science, we propose a framework based on item response theory for modeling these perceptions. We apply this framework to real-world experiments, in which each participant works alongside another person or an AI agent in a question-answering setting, repeatedly assessing their teammate’s performance. Using this experimental data, we demonstrate the use of our framework for testing research questions about people’s perceptions of both AI agents and other people. We contrast mental models of AI teammates with those of human teammates as we characterize the dimensionality of these mental models, their development over time, and the influence of the participants’ own self-perception. Our results indicate that people expect AI agents’ performance to be significantly better on average than the performance of other humans, with less variation across different types of problems. We conclude with a discussion of the implications of these findings for human-AI interaction.
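To make the modeling approach concrete, below is a minimal sketch of a two-parameter logistic (2PL) item response model of the kind such a framework builds on, fit to hypothetical perceived-performance judgments. The variable names, toy data, and maximum-likelihood fitting routine are illustrative assumptions, not the authors' implementation; here the latent "abilities" stand for a participant's perceived competence of each teammate and the "items" stand for problem types.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic function


def neg_log_likelihood(params, responses, n_teammates, n_items):
    """responses[i, j] = 1 if teammate j was judged successful on item i, else 0."""
    ability = params[:n_teammates]                           # perceived ability of each teammate
    difficulty = params[n_teammates:n_teammates + n_items]   # perceived difficulty of each item
    discrimination = params[n_teammates + n_items:]          # how sharply items separate teammates
    logits = discrimination[:, None] * (ability[None, :] - difficulty[:, None])
    p = expit(logits)
    eps = 1e-9
    return -np.sum(responses * np.log(p + eps) + (1 - responses) * np.log(1 - p + eps))


# Toy data: one participant's judgments of 3 teammates across 5 problem types.
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(5, 3)).astype(float)
x0 = np.zeros(3 + 5 + 5)
x0[3 + 5:] = 1.0  # start discrimination parameters at 1
fit = minimize(neg_log_likelihood, x0, args=(responses, 3, 5), method="L-BFGS-B")
print("Estimated perceived abilities:", fit.x[:3])
```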
Award ID(s): 1900644
PAR ID: 10460956
Author(s) / Creator(s): ; ; ;
Date Published:
Journal Name: FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency
Page Range / eLocation ID: 1723-1734
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like This
  1. Effective human-AI collaboration requires agents to adapt their roles and levels of support based on human needs, task requirements, and complexity. Traditional human-AI teaming often relies on a pre-determined robot communication scheme, restricting teamwork adaptability in complex tasks. Leveraging the strong communication capabilities of Large Language Models (LLMs), we propose a Human-Robot Teaming framework with Multi-Modal Language feedback (HRT-ML), designed to enhance human-robot interaction by adjusting the frequency and content of language-based feedback. The HRT-ML framework includes two core modules: a Coordinator for high-level, low-frequency strategic guidance and a Manager for task-specific, high-frequency instructions, enabling passive and active interactions with human teammates. To assess the impact of language feedback in collaborative scenarios, we conducted experiments in an enhanced Overcooked-AI game environment with varying levels of task complexity (easy, medium, hard) and feedback frequency (inactive, passive, active, superactive). Our results show that as task complexity increases relative to human capabilities, human teammates exhibit stronger preferences for robotic agents that can offer frequent, proactive support. However, when task complexity exceeds the LLM's capacity, noisy and inaccurate feedback from superactive agents can instead hinder team performance, as it requires human teammates to spend more effort interpreting and responding to a large volume of communication, with limited performance return. Our results offer a general principle for robotic agents to dynamically adjust their levels and frequencies of communication to work seamlessly with humans and achieve improved teaming performance.
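As an illustration of the two-module design described above, here is a minimal sketch of a Coordinator/Manager loop. The class and method names, the timestep periods assigned to each feedback mode, and the llm_generate placeholder are assumptions for illustration, not the authors' HRT-ML code.

```python
from dataclasses import dataclass


def llm_generate(prompt: str) -> str:
    """Placeholder for whatever LLM call produces the language feedback."""
    return f"[LLM feedback for: {prompt}]"


@dataclass
class Coordinator:
    """High-level, low-frequency strategic guidance."""
    period: int = 20  # advise every `period` timesteps

    def maybe_advise(self, t: int, task_state: dict) -> str | None:
        if t % self.period == 0:
            return llm_generate(f"strategic guidance given {task_state}")
        return None


@dataclass
class Manager:
    """Task-specific, high-frequency instructions; rate depends on the feedback mode."""
    mode: str = "passive"  # one of: inactive, passive, active, superactive
    _periods = {"inactive": 0, "passive": 10, "active": 5, "superactive": 1}

    def maybe_instruct(self, t: int, task_state: dict) -> str | None:
        period = self._periods[self.mode]
        if period and t % period == 0:
            return llm_generate(f"task instruction given {task_state}")
        return None


# At each timestep, collect whatever feedback the two modules choose to emit.
coordinator, manager = Coordinator(period=10), Manager(mode="active")
for t in range(1, 11):
    for msg in (coordinator.maybe_advise(t, {"step": t}), manager.maybe_instruct(t, {"step": t})):
        if msg:
            print(t, msg)
```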
  2. A prerequisite for social coordination is bidirectional communication between teammates, each playing two roles simultaneously: receptive listener and expressive speaker. For robots working with humans in complex situations with multiple goals that differ in importance, failure to fulfill the expectation of either role could undermine group performance due to misalignment of values between humans and robots. Specifically, a robot needs to serve as an effective listener to infer human users’ intents from instructions and feedback, and as an expressive speaker to explain its decision processes to users. Here, we investigate how to foster effective bidirectional human-robot communication in the context of value alignment, in which collaborative robots and users form an aligned understanding of the importance of possible task goals. We propose an explainable artificial intelligence (XAI) system in which a group of robots predicts users’ values by taking in situ feedback into consideration while communicating their decision processes to users through explanations. To learn from human feedback, our XAI system integrates a cooperative communication model for inferring human values associated with multiple desirable goals. To be interpretable to humans, the system simulates human mental dynamics and predicts optimal explanations using graphical models. We conducted psychological experiments to examine the core components of the proposed computational framework. Our results show that real-time human-robot mutual understanding in complex cooperative tasks is achievable with a learning model based on bidirectional communication. We believe that this interaction framework can shed light on bidirectional value alignment in communicative XAI systems and, more broadly, in future human-machine teaming systems.
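One way to picture the value-inference component described above is a Bayesian update over candidate goal-importance vectors, using a softmax model of how a user endorses robot proposals. The candidate hypotheses, the softmax feedback model, and the update rule below are illustrative assumptions, not the authors' cooperative communication model.

```python
import numpy as np


def update_value_belief(belief, candidates, proposals, feedback, beta=2.0):
    """Bayesian update over candidate goal-importance vectors.

    belief:     (K,) prior probability of each candidate value vector
    candidates: (K, G) candidate importance weights over G goals
    proposals:  (M, G) utility of each of M robot proposals under each goal
    feedback:   index of the proposal the human endorsed
    """
    likelihood = np.empty(len(candidates))
    for k, w in enumerate(candidates):
        scores = proposals @ w                                        # value of each proposal under hypothesis k
        probs = np.exp(beta * scores) / np.exp(beta * scores).sum()   # softmax choice model of human feedback
        likelihood[k] = probs[feedback]                               # probability of the observed endorsement
    posterior = belief * likelihood
    return posterior / posterior.sum()


# Three hypotheses about how much the user values goal 1 vs. goal 2.
candidates = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
belief = np.ones(3) / 3
proposals = np.array([[1.0, 0.0], [0.0, 1.0]])  # proposal 0 serves goal 1, proposal 1 serves goal 2
belief = update_value_belief(belief, candidates, proposals, feedback=1)
print("Posterior over value hypotheses:", belief)  # mass shifts toward the goal-2-heavy hypothesis
```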
  3. Understanding players' mental models is crucial for game designers who wish to successfully integrate player-AI interactions into their games. However, game designers face the difficult challenge of anticipating how players model these AI agents during gameplay and how they may change their mental models with experience. In this work, we conduct a qualitative study to examine how pairs of players develop mental models of an adversarial AI player during gameplay in the multiplayer drawing game iNNk. We conducted ten gameplay sessions in which two players (n = 20, 10 pairs) worked together to defeat an AI player. As a result of our analysis, we uncovered two dominant dimensions that describe players' mental model development: focus and style. The first dimension, focus, refers to what players pay attention to when developing their mental model (i.e., top-down vs. bottom-up focus). The second dimension, style, refers to how players integrate new information into their mental model (i.e., systematic vs. reactive style). In our preliminary framework, we further note how players process a change when a discrepancy occurs, which we observed happening through comparisons (i.e., comparing to other systems, to gameplay, and to self). We offer these results as a preliminary framework for player mental model development to help game designers anticipate how different players may model adversarial AI players during gameplay.
  4. AI-enabled agents designed to assist humans are gaining traction in a variety of domains such as healthcare and disaster response. It is evident that, as we move forward, these agents will play increasingly vital roles in our lives. To realize this future successfully and mitigate its unintended consequences, it is imperative that humans have a clear understanding of the agents that they work with. Policy summarization methods help facilitate this understanding by showcasing key examples of agent behaviors to their human users. Yet, existing methods produce “one-size-fits-all” summaries for a generic audience ahead of time. Drawing inspiration from research in pedagogy, we posit that personalized policy summaries can more effectively enhance user understanding. To evaluate this hypothesis, this paper presents and benchmarks a novel technique: Personalized Policy Summarization (PPS). PPS discerns a user’s mental model of the agent through a series of algorithmically generated questions and crafts customized policy summaries to enhance user understanding. Unlike existing methods, PPS actively engages with users to gauge their comprehension of the agent behavior, subsequently generating tailored explanations on the fly. Through a combination of numerical and human subject experiments, we confirm the utility of this personalized approach to explainable AI. 
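A rough sketch of the interactive loop that PPS describes might look like the following: probe the user's predictions of the agent's behavior, then summarize the states where those predictions diverge from the actual policy. All names and the simple disagreement heuristic are hypothetical stand-ins, not the paper's algorithm, which discerns the user's mental model through algorithmically generated questions and crafts customized summaries from it.

```python
from typing import Callable, Hashable


def personalized_summary(
    probe_states: list[Hashable],
    agent_action: Callable[[Hashable], str],
    ask_user_prediction: Callable[[Hashable], str],
    budget: int = 3,
) -> list[tuple[Hashable, str]]:
    """Question the user, then summarize behavior in the states they misunderstood."""
    # 1. Question phase: ask the user what the agent would do in each probe state.
    misunderstood = []
    for s in probe_states:
        if ask_user_prediction(s) != agent_action(s):
            misunderstood.append(s)
    # 2. Summary phase: show the agent's actual behavior in (up to `budget`) of those states.
    return [(s, agent_action(s)) for s in misunderstood[:budget]]


# Toy stand-ins for the agent's policy and the user's current beliefs about it.
policy = {"low_battery": "recharge", "obstacle": "detour", "idle": "patrol"}
user_guess = {"low_battery": "patrol", "obstacle": "detour", "idle": "patrol"}
print(personalized_summary(list(policy), policy.get, user_guess.get, budget=2))
# -> [('low_battery', 'recharge')]: the summary focuses on the case the user got wrong.
```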
  5. Collaborative tasks often begin with partial task knowledge and incomplete initial plans from each partner. To complete these tasks, agents need to engage in situated communication with their partners and coordinate their partial plans towards a complete plan to achieve a joint task goal. While such collaboration seems effortless in a human-human team, it is highly challenging for human-AI collaboration. To address this limitation, this paper takes a step towards collaborative plan acquisition, where humans and agents strive to learn and communicate with each other to acquire a complete plan for joint tasks. Specifically, we formulate a novel problem for agents to predict the missing task knowledge for themselves and for their partners based on rich perceptual and dialogue history. We extend a situated dialogue benchmark for symmetric collaborative tasks in a 3D blocks world and investigate computational strategies for plan acquisition. Our empirical results suggest that predicting the partner's missing knowledge is a more viable approach than predicting one's own. We show that explicit modeling of the partner's dialogue moves and mental states produces improved and more stable results than a model without such information. These results provide insight for future AI agents that can predict what knowledge their partner is missing and, therefore, proactively communicate that information to help the partner reach a common understanding of joint tasks.
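To give a flavor of the prediction task described above, here is a toy sketch that scores which plan steps a partner is most likely still missing, given the dialogue so far. The bag-of-words heuristic and all names are illustrative assumptions; the paper's models operate over rich perceptual and dialogue features rather than keyword matching.

```python
def predict_partner_missing(own_plan: set[str], dialogue: list[str],
                            candidate_steps: set[str]) -> list[tuple[str, float]]:
    """Score candidate plan steps by how likely the partner is still missing them."""
    mentioned = " ".join(dialogue).lower()
    scored = []
    for step in candidate_steps:
        score = 1.0
        if step.lower() in mentioned:
            score -= 0.6  # the step came up in dialogue, so the partner has likely heard of it
        if step in own_plan:
            score -= 0.2  # we hold this step ourselves and may already have shared it
        scored.append((step, max(score, 0.0)))
    return sorted(scored, key=lambda pair: -pair[1])


dialogue = ["I placed the blue block on the red one", "next we need to fetch the yellow block"]
own_plan = {"place blue on red"}
candidates = {"place blue on red", "fetch the yellow block", "stack green on yellow"}
print(predict_partner_missing(own_plan, dialogue, candidates))
```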