Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Improving our understanding of how humans perceive AI teammates is an important foundation for our general understanding of human-AI teams. Extending relevant work from cognitive science, we propose a framework based on item response theory for modeling these perceptions. We apply this framework to real-world experiments, in which each participant works alongside another person or an AI agent in a question-answering setting, repeatedly assessing their teammate’s performance. Using this experimental data, we demonstrate the use of our framework for testing research questions about people’s perceptions of both AI agents and other people. We contrast mental models of AI teammates with those of human teammates as we characterize the dimensionality of these mental models, their development over time, and the influence of the participants’ own self-perception. Our results indicate that people expect AI agents’ performance to be significantly better on average than the performance of other humans, with less variation across different types of problems. We conclude with a discussion of the implications of these findings for human-AI interaction.more » « less
-
Artificial intelligence (AI) has the potential to improve human decision-making by providing decision recommendations and problem-relevant information to assist human decision-makers. However, the full realization of the potential of human–AI collaboration continues to face several challenges. First, the conditions that support complementarity (i.e., situations in which the performance of a human with AI assistance exceeds the performance of an unassisted human or the AI in isolation) must be understood. This task requires humans to be able to recognize situations in which the AI should be leveraged and to develop new AI systems that can learn to complement the human decision-maker. Second, human mental models of the AI, which contain both expectations of the AI and reliance strategies, must be accurately assessed. Third, the effects of different design choices for human-AI interaction must be understood, including both the timing of AI assistance and the amount of model information that should be presented to the human decision-maker to avoid cognitive overload and ineffective reliance strategies. In response to each of these three challenges, we present an interdisciplinary perspective based on recent empirical and theoretical findings and discuss new research directions.more » « less
-
Abstract AI assistance is readily available to humans in a variety of decision-making applications. In order to fully understand the efficacy of such joint decision-making, it is important to first understand the human’s reliance on AI. However, there is a disconnect between how joint decision-making is studied and how it is practiced in the real world. More often than not, researchers ask humans to provide independent decisions before they are shown AI assistance. This is done to make explicit the influence of AI assistance on the human’s decision. We develop a cognitive model that allows us to infer the
latent reliance strategy of humans on AI assistance without asking the human to make an independent decision. We validate the model’s predictions through two behavioral experiments. The first experiment follows aconcurrent paradigm where humans are shown AI assistance alongside the decision problem. The second experiment follows asequential paradigm where humans provide an independent judgment on a decision problem before AI assistance is made available. The model’s predicted reliance strategies closely track the strategies employed by humans in the two experimental paradigms. Our model provides a principled way to infer reliance on AI-assistance and may be used to expand the scope of investigation on human-AI collaboration. -
Artificial intelligence (AI) and machine learning models are being increasingly deployed in real-world applications. In many of these applications, there is strong motivation to develop hybrid systems in which humans and AI algorithms can work together, leveraging their complementary strengths and weaknesses. We develop a Bayesian framework for combining the predictions and different types of confidence scores from humans and machines. The framework allows us to investigate the factors that influence complementarity, where a hybrid combination of human and machine predictions leads to better performance than combinations of human or machine predictions alone. We apply this framework to a large-scale dataset where humans and a variety of convolutional neural networks perform the same challenging image classification task. We show empirically and theoretically that complementarity can be achieved even if the human and machine classifiers perform at different accuracy levels as long as these accuracy differences fall within a bound determined by the latent correlation between human and machine classifier confidence scores. In addition, we demonstrate that hybrid human–machine performance can be improved by differentiating between the errors that humans and machine classifiers make across different class labels. Finally, our results show that eliciting and including human confidence ratings improve hybrid performance in the Bayesian combination model. Our approach is applicable to a wide variety of classification problems involving human and machine algorithms.more » « less
-
An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human nor model are perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently have higher accuracies than the model or human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints.more » « less
-
Abstract Typical fMRI studies have focused on either the mean trend in the blood‐oxygen‐level‐dependent (BOLD) time course or functional connectivity (FC). However, other statistics of the neuroimaging data may contain important information. Despite studies showing links between the variance in the BOLD time series (BV) and age and cognitive performance, a formal framework for testing these effects has not yet been developed. We introduce the variance design general linear model (VDGLM), a novel framework that facilitates the detection of variance effects. We designed the framework for general use in any fMRI study by modeling both mean and variance in BOLD activation as a function of experimental design. The flexibility of this approach allows the VDGLM to (a) simultaneously make inferences about a mean or variance effect while controlling for the other and (b) test for variance effects that could be associated with multiple conditions and/or noise regressors. We demonstrate the use of the VDGLM in a working memory application and show that engagement in a working memory task is associated with whole‐brain decreases in BOLD variance.