Proper calibration of human reliance on AI is fundamental to achieving complementary performance in AI-assisted human decision-making. Most previous work has focused on assessing user reliance, and more broadly trust, retrospectively, through user perceptions and task-based measures. In this work, we explore the relationship between eye gaze and reliance under varying task difficulties and AI performance levels in a spatial reasoning task. Our results show a strong positive correlation between the percentage of gaze duration spent on the AI suggestion and both user-AI task agreement and users’ perceived reliance. Moreover, user agency is preserved particularly when the task is easy and when AI performance is low or inconsistent. Our results also reveal nuanced differences between reliance and trust. We discuss the potential of using eye gaze to gauge human reliance on AI in real time, enabling adaptive AI assistance for optimal human-AI team performance.
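For illustration only, a minimal sketch of how such a gaze-reliance correlation could be computed from per-trial measurements; all variable names and values below are hypothetical and are not the study's data or analysis.

```python
# Illustrative only: how a gaze-reliance correlation like the one reported above
# could be computed from per-trial measurements. All values are hypothetical.
import numpy as np

# Percent of each trial's gaze duration spent on the AI suggestion.
pct_gaze_on_ai = np.array([12.0, 35.5, 58.2, 71.0, 83.4])       # hypothetical
# Whether the user's final decision agreed with the AI suggestion on that trial.
agreed_with_ai = np.array([0, 0, 1, 1, 1], dtype=float)         # hypothetical

# Pearson correlation between gaze share and user-AI agreement.
r = np.corrcoef(pct_gaze_on_ai, agreed_with_ai)[0, 1]
print(f"Pearson r = {r:.2f}")
```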
Predicting Human Activities from User-Generated Content
The activities we do are linked to our interests, personality, political preferences, and the decisions we make about the future. In this paper, we explore the task of predicting human activities from user-generated content. We collect a dataset containing instances of social media users writing about a range of everyday activities. We then use a state-of-the-art sentence embedding framework tailored to recognize the semantics of human activities and perform an automatic clustering of these activities. We train a neural network model to predict which clusters contain activities performed by a given user, based on the text of their previous posts and self-description. Additionally, we explore the degree to which incorporating inferred user traits into our model helps with this prediction task.
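A minimal sketch of the embed, cluster, and predict pipeline described above, using off-the-shelf components (SentenceTransformer, KMeans, and a small scikit-learn MLP) as stand-ins for the paper's tailored embedding framework and neural model; all texts and labels shown are placeholders, not the paper's data.

```python
# Hedged sketch: off-the-shelf stand-ins for the pipeline described above.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# 1. Embed activity phrases mined from posts and cluster them into activity groups.
activities = ["went for a run", "baked banana bread",
              "watched a movie", "lifted weights"]        # placeholders
activity_vecs = encoder.encode(activities)
n_clusters = 2
cluster_ids = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(activity_vecs)

# 2. Represent each user by the embedding of their previous posts and self-description.
user_texts = [
    "Trail runner and gym regular; I post about weekend workouts.",  # placeholder
    "Home baker who reviews movies in my spare time.",               # placeholder
]
user_vecs = encoder.encode(user_texts)

# 3. Multi-label targets: which activity clusters each user has reported doing (placeholder labels).
y = np.zeros((len(user_texts), n_clusters), dtype=int)
y[0, cluster_ids[0]] = 1   # user 0 reported the first activity
y[1, cluster_ids[1]] = 1   # user 1 reported the second activity

# 4. Train a small neural network to predict cluster membership from user text.
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500, random_state=0)
clf.fit(user_vecs, y)
print(clf.predict(user_vecs))   # multi-label predictions over activity clusters
```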
- Award ID(s): 1815291
- PAR ID: 10111344
- Date Published:
- Journal Name: Association for Computational Linguistics
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We explore task tolerances, i.e., allowable position or rotation inaccuracy, as an important resource to facilitate smooth and effective telemanipulation. Task tolerances provide a robot flexibility to generate smooth and feasible motions; however, in teleoperation, this flexibility may make the user’s control less direct. In this work, we implemented a telemanipulation system that allows a robot to autonomously adjust its configuration within task tolerances. We conducted a user study comparing a telemanipulation paradigm that exploits task tolerances (functional mimicry) to a paradigm that requires the robot to exactly mimic its human operator (exact mimicry), and assessed how the choice of paradigm shapes user experience and task performance. Our results show that autonomous adjustments within task tolerances can lead to performance improvements without sacrificing perceived control of the robot. Additionally, we find that users perceive the robot to be more under control, predictable, fluent, and trustworthy in functional mimicry than in exact mimicry.
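As an illustration of the core idea, a minimal sketch of a task-tolerance check on an end-effector pose; the tolerance values, quaternion convention, and function name are assumptions, and the functional-mimicry system described above additionally optimizes the robot configuration within these bounds, which is omitted here.

```python
# Hedged sketch of a task-tolerance check; tolerance values are assumed, not from the paper.
import numpy as np
from scipy.spatial.transform import Rotation as R

POS_TOL_M = 0.01      # assumed allowable position error (meters)
ROT_TOL_RAD = 0.15    # assumed allowable rotation error (radians)

def within_task_tolerance(cmd_pos, cmd_quat, cand_pos, cand_quat):
    """Return True if a candidate end-effector pose deviates from the commanded
    pose by no more than the position and rotation tolerances."""
    pos_err = np.linalg.norm(np.asarray(cand_pos) - np.asarray(cmd_pos))
    # Relative rotation between commanded and candidate orientations (quaternions, xyzw order).
    rel = R.from_quat(cmd_quat).inv() * R.from_quat(cand_quat)
    rot_err = np.linalg.norm(rel.as_rotvec())   # geodesic angle in radians
    return pos_err <= POS_TOL_M and rot_err <= ROT_TOL_RAD
```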
Understanding what sequence of steps is needed to complete a goal can help artificial intelligence systems reason about human activities. Past work in NLP has examined the task of goal-step inference for text. We introduce the visual analogue. We propose the Visual Goal-Step Inference (VGSI) task, where a model is given a textual goal and must choose which of four images represents a plausible step towards that goal. With a new dataset harvested from wikiHow consisting of 772,277 images representing human actions, we show that our task is challenging for state-of-the-art multimodal models. Moreover, the multimodal representation learned from our data can be effectively transferred to other datasets like HowTo100m, increasing the VGSI accuracy by 15-20%. Our task will facilitate multimodal reasoning about procedural events.
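A hedged sketch of how a VGSI-style example could be scored with an off-the-shelf image-text model (CLIP via Hugging Face Transformers); this is one illustrative choice, not the paper's exact models or evaluation setup, and the image paths are placeholders.

```python
# Illustrative scoring of four candidate step images against a textual goal using CLIP.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def pick_step_image(goal: str, image_paths: list[str]) -> int:
    """Return the index of the image judged most plausible as a step toward `goal`."""
    images = [Image.open(p) for p in image_paths]
    inputs = processor(text=[goal], images=images, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # logits_per_text has shape (1, num_images): similarity of the goal to each candidate image.
    return int(out.logits_per_text.argmax(dim=-1).item())

# Hypothetical usage with placeholder paths:
# best = pick_step_image("make a paper airplane", ["a.jpg", "b.jpg", "c.jpg", "d.jpg"])
```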
Previous research has shown how statistical model checking can be used with human task behavior modeling and human reliability analysis to make realistic predictions about human errors and error rates. However, these efforts have not accounted for the impact that design changes can have on human reliability. In this research, we address this deficiency by using similarity theory from human cognitive modeling. This replicates how negative transfer can cause people to perform old task behaviors on modified systems. We present details about how this approach was realized with the PRISM model checker and the enhanced operator function model. We report results of a validation exercise using an application from the literature. We discuss the implications of our results and describe future research.
Evaluating the quality of accessible image captions with human raters is challenging: a visually impaired user may not know how comprehensive a caption is, whereas a sighted assistant may not know what information a user will need from a caption. To explore how image captioners and caption consumers assess caption content, we conducted a series of collaborative captioning sessions in which six pairs, each consisting of a blind person and their sighted partner, worked together to discuss, create, and evaluate image captions. By making captioning a collaborative task, we were able to observe captioning strategies, to elicit questions and answers about image captions, and to explore blind users’ caption preferences. Our findings provide insight about the process of creating good captions and serve as a case study for cross-ability collaboration between blind and sighted people.