Search for: All records

Creators/Authors contains: "Perer, Adam"

Note: Clicking on a Digital Object Identifier (DOI) link will take you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (an administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Many papers make claims about specific visualization techniques that are said to enhance or calibrate trust in AI systems. But a design choice that enhances trust in some cases appears to damage it in others. In this paper, we explore this inherent duality through an analogy with “knobs”. Turning a knob too far in one direction may result in under-trust; too far in the other, in over-trust; and, turned further still, in a confusing distortion. While these designs, or so-called “knobs,” are not inherently evil, they can be misused or applied in an adversarial context and thereby manipulated to mislead users or to promote unwarranted levels of trust in AI systems. When a visualization that has no meaningful connection with the underlying model or data is employed to enhance trust, we refer to the result as “trust junk.” From a review of 65 papers, we identify nine commonly made claims about trust calibration. We synthesize them into a framework of knobs that can be used for good or “evil,” and distill our findings into observed pitfalls for the responsible design of human-AI systems.
  2. Mueller, Florian Floyd; Kyburz, Penny; Williamson, Julie R; Sas, Corina; Wilson, Max L; Dugas, Phoebe Toups; Shklovski, Irina (Ed.)
    Advances in artificial intelligence (AI) have enabled unprecedented capabilities, yet innovation teams struggle when envisioning AI concepts. Data science teams think of innovations users do not want, while domain experts think of innovations that cannot be built. A lack of effective ideation seems to be a breakdown point. How might multidisciplinary teams identify buildable and desirable use cases? This paper presents a firsthand account of ideating AI concepts to improve critical care medicine. As a team of data scientists, clinicians, and HCI researchers, we conducted a series of design workshops to explore more effective approaches to AI concept ideation and problem formulation. We detail our process, the challenges we encountered, and the practices and artifacts that proved effective. We discuss the research implications for improved collaboration and stakeholder engagement, and consider the role HCI might play in reducing the high failure rate experienced in AI innovation.
  3. People work with AI systems to improve their decision making, but often under- or over-rely on AI predictions and perform worse than they would have unassisted. To help people appropriately rely on AI aids, we propose showing them behavior descriptions: details of how AI systems perform on subgroups of instances. We tested the efficacy of behavior descriptions through user studies with 225 participants in three distinct domains: fake review detection, satellite image classification, and bird classification. We found that behavior descriptions can increase human-AI accuracy through two mechanisms: helping people identify AI failures and increasing people's reliance on the AI when it is more accurate. These findings highlight the importance of people's mental models in human-AI collaboration and show that informing people of high-level AI behaviors can significantly improve AI-assisted decision making. (A minimal code sketch of per-subgroup behavior descriptions appears after this list.)
  4. Ensuring effective public understanding of algorithmic decisions that are powered by machine learning techniques has become an urgent task with the increasing deployment of AI systems into our society. In this work, we present a concrete step toward this goal by redesigning confusion matrices for binary classification to support non-experts in understanding the performance of machine learning models. Through interviews (n=7) and a survey (n=102), we mapped out two major sets of challenges laypeople have in understanding standard confusion matrices: the general terminologies and the matrix design. We further identified three sub-challenges regarding the matrix design, namely confusion about the direction of reading the data, the layered relations, and the quantities involved. We then conducted an online experiment with 483 participants to evaluate how effectively a series of alternative representations targets each of those challenges in the context of an algorithm for making recidivism predictions. We developed three levels of questions to evaluate users’ objective understanding, and assessed the alternatives in terms of accuracy in answering those questions, completion time, and subjective understanding. Our results suggest that (1) only by contextualizing terminologies can we significantly improve users’ understanding and (2) flow charts, which help point out the direction of reading the data, were most useful in improving objective understanding. Our findings set the stage for developing more intuitive and generally understandable representations of the performance of machine learning models. (A minimal code sketch of a contextualized confusion-matrix report appears after this list.)
  5. Whether figuring out where to eat in an unfamiliar city or deciding which apartment to live in, consumer-generated data (i.e., reviews and forum posts) are often an important influence in online decision making. To make sense of these rich repositories of diverse opinions, searchers need to sift through a large number of reviews to characterize each item based on the aspects they care about. We introduce a novel system, SearchLens, in which searchers build up a collection of “Lenses” that reflect their different latent interests and compose the Lenses to find relevant items across different contexts. Based on the Lenses, SearchLens generates personalized interfaces with visual explanations that promote transparency and enable deeper exploration. While prior work found that searchers may not wish to put in effort specifying their goals without immediate and sufficient benefits, results from a controlled lab study suggest that our approach incentivized participants to express their interests more richly than in a baseline condition, and a field study showed that participants found benefits in SearchLens while conducting their own tasks. (A minimal code sketch of composable lenses appears after this list.)
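
Below is a minimal Python sketch of the "behavior descriptions" idea from entry 3: short statements of how an AI system performs on subgroups of instances, phrased relative to its overall accuracy. The subgroup field, the 5-point threshold, and the wording are illustrative assumptions, not the implementation evaluated in that paper.

```python
# Sketch: summarize per-subgroup performance as plain-language "behavior descriptions".
# Field names, thresholds, and phrasing are assumptions for illustration only.
from collections import defaultdict

def behavior_descriptions(examples, overall_accuracy):
    """examples: iterable of dicts with keys 'subgroup', 'prediction', 'label'."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        total[ex["subgroup"]] += 1
        correct[ex["subgroup"]] += int(ex["prediction"] == ex["label"])

    descriptions = []
    for group, n in total.items():
        acc = correct[group] / n
        # Phrase each subgroup relative to overall accuracy so users know
        # when to rely on the AI and when to double-check it.
        if acc >= overall_accuracy + 0.05:
            note = "more accurate than usual"
        elif acc <= overall_accuracy - 0.05:
            note = "less accurate than usual"
        else:
            note = "about as accurate as usual"
        descriptions.append(f"On {group} (n={n}), the AI is {note} ({acc:.0%} correct).")
    return descriptions

# Made-up data for a fake-review detector, one of the domains named in the abstract.
examples = [
    {"subgroup": "short reviews", "prediction": 1, "label": 1},
    {"subgroup": "short reviews", "prediction": 0, "label": 1},
    {"subgroup": "long reviews", "prediction": 1, "label": 1},
    {"subgroup": "long reviews", "prediction": 0, "label": 0},
]
for line in behavior_descriptions(examples, overall_accuracy=0.75):
    print(line)
```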
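Next, a minimal sketch related to entry 4: computing the four cells of a binary confusion matrix and reporting them in contextualized wording (the recidivism framing mentioned in the abstract) rather than bare terminology. The exact phrasing is an assumption; the representations actually evaluated in the paper (e.g., flow charts) are not reproduced here.

```python
# Sketch: report confusion-matrix cells with contextualized wording for non-experts.
# The sentence templates are assumptions, not the paper's evaluated designs.

def contextualized_confusion_report(labels, predictions):
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    return "\n".join([
        f"{tp} people who did reoffend were predicted to reoffend (true positives)",
        f"{fn} people who did reoffend were predicted not to reoffend (false negatives)",
        f"{fp} people who did not reoffend were predicted to reoffend (false positives)",
        f"{tn} people who did not reoffend were predicted not to reoffend (true negatives)",
    ])

print(contextualized_confusion_report(labels=[1, 1, 0, 0, 1], predictions=[1, 0, 1, 0, 1]))
```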
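Finally, a minimal sketch of the "Lenses" idea from entry 5: a lens as a named set of weighted keywords reflecting one interest, with lenses composed to rank items by their reviews and per-lens scores kept around to support visual explanations. The keyword-count scoring and additive composition are illustrative assumptions, not SearchLens itself.

```python
# Sketch: composable "lenses" over consumer reviews. Scoring scheme is an assumption.
from dataclasses import dataclass

@dataclass
class Lens:
    name: str
    keywords: dict  # keyword -> weight

    def score(self, reviews):
        text = " ".join(reviews).lower()
        return sum(weight * text.count(kw) for kw, weight in self.keywords.items())

def rank_items(items, lenses):
    """items: dict of item name -> list of review strings."""
    scored = []
    for name, reviews in items.items():
        per_lens = {lens.name: lens.score(reviews) for lens in lenses}
        scored.append((name, sum(per_lens.values()), per_lens))
    # Highest combined score first; per-lens scores support per-interest explanations.
    return sorted(scored, key=lambda t: t[1], reverse=True)

quiet = Lens("quiet", {"quiet": 2.0, "noisy": -2.0})
spicy = Lens("spicy food", {"spicy": 1.5, "bland": -1.0})
restaurants = {
    "Noodle Bar": ["quiet spot with very spicy broth", "a bit noisy at lunch"],
    "Cafe Rosa": ["bland menu but quiet atmosphere"],
}
for name, total, per_lens in rank_items(restaurants, [quiet, spicy]):
    print(name, round(total, 1), per_lens)
```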