Title: Inferential Tasks as an Evaluation Technique for Visualization
Designing suitable tasks for visualization evaluation remains challenging. Traditional evaluation techniques commonly rely on 'low-level' or 'open-ended' tasks to assess the efficacy of a proposed visualization; however, nontrivial trade-offs exist between the two. Low-level tasks allow for robust quantitative evaluations but are not indicative of the complex usage of a visualization. Open-ended tasks, while excellent for insight-based evaluations, are typically unstructured and require time-consuming interviews. Bridging this gap, we propose inferential tasks: a complementary task category based on inferential learning in psychology. Inferential tasks produce quantitative evaluation data by prompting users to form and validate their own findings with a visualization. We demonstrate the use of inferential tasks through a validation experiment on two well-known visualization tools.
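The abstract does not include an implementation; purely as a hypothetical sketch of what a quantitative inferential-task trial might look like, the Python below represents a trial in which a user forms a finding and then validates it, and aggregates trials into simple measures. All names here (InferentialTrial, score_trials) are ours, not the paper's.

```python
from dataclasses import dataclass

@dataclass
class InferentialTrial:
    """One inferential task: the user states a finding, then validates it."""
    prompt: str      # e.g., "Form a hypothesis about trends in this chart"
    hypothesis: str  # the finding the participant wrote down
    validated: bool  # did the participant confirm it with the visualization?
    correct: bool    # does the finding actually hold in the underlying data?

def score_trials(trials):
    """Aggregate trials into simple quantitative evaluation measures."""
    n = len(trials)
    validation_rate = sum(t.validated for t in trials) / n
    accuracy = sum(t.correct for t in trials) / n
    return {"n": n, "validation_rate": validation_rate, "accuracy": accuracy}

trials = [
    InferentialTrial("Find a trend in category A", "A rises over time", True, True),
    InferentialTrial("Compare groups B and C", "B outperforms C", False, False),
]
print(score_trials(trials))
```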
Award ID(s):
1939945
PAR ID:
10394300
Editor(s):
Agus, Marco; Aigner, Wolfgang; Hoellt, Thomas
Date Published:
Journal Name:
EuroVis 2022 Short Papers
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce MAUVE, a comparison measure for open-ended text generation, which directly compares the learnt distribution from a text generation model to the distribution of human-written text using divergence frontiers. MAUVE scales up to modern text generation models by computing information divergences in a quantized embedding space. Through an extensive empirical study on three open-ended generation tasks, we find that MAUVE identifies known properties of generated text, scales naturally with model size, and correlates with human judgments, with fewer restrictions than existing distributional evaluation metrics. 
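MAUVE has an official implementation whose API is not reproduced here; as a simplified sketch of the idea under our own assumptions, one can quantize embeddings of human and model text into a shared k-means codebook and compare the two cluster histograms. MAUVE proper summarizes a divergence frontier over mixture distributions; the symmetrized KL below is a deliberate simplification, and the random arrays stand in for real sentence embeddings.

```python
import numpy as np
from sklearn.cluster import KMeans

def quantized_divergence(human_emb, model_emb, k=8, eps=1e-10):
    """Toy stand-in for MAUVE's idea: quantize a shared embedding space,
    then compare the two texts' cluster histograms with a symmetrized KL."""
    km = KMeans(n_clusters=k, n_init=10).fit(np.vstack([human_emb, model_emb]))
    def hist(emb):
        counts = np.bincount(km.predict(emb), minlength=k).astype(float)
        return (counts + eps) / (counts.sum() + k * eps)
    p, q = hist(human_emb), hist(model_emb)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * (kl(p, q) + kl(q, p))

# Random stand-ins for sentence embeddings of human vs. generated text.
rng = np.random.default_rng(0)
human = rng.normal(0.0, 1.0, size=(200, 16))
model = rng.normal(0.3, 1.0, size=(200, 16))
print(quantized_divergence(human, model))
```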
  2. Humans often communicate using body movements like winks, waves, and nods. However, it is unclear how we identify when someone's physical actions are communicative. Given people's propensity to interpret each other's behavior as aimed at producing changes in the world, we hypothesize that people expect communicative actions to efficiently reveal that they lack an external goal. Using computational models of goal inference, we predict that movements that are unlikely to be produced when acting toward the world, and repetitive movements in particular, ought to be seen as communicative. We find support for our account across a variety of paradigms, including graded acceptability tasks, forced-choice tasks, indirect prompts, and open-ended explanation tasks, in both market-integrated and non-market-integrated communities. Our work shows that the recognition of communicative action is grounded in an inferential process that stems from fundamental computations shared across different forms of action interpretation.
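The paper's goal-inference models are not reproduced here; as a toy illustration of the reasoning, with made-up numbers, Bayes' rule shows how an action that is unlikely under any world-directed goal (e.g., a repetitive wave) shifts the posterior toward a communicative intent.

```python
# Toy Bayesian goal inference: which intent best explains an observed action?
# The priors and likelihoods below are illustrative numbers, not the paper's model.
priors = {"world-directed": 0.7, "communicative": 0.3}

# P(action | intent): a repetitive wave is a poor way to achieve any
# external goal, but a typical way to signal.
likelihood = {
    "efficient reach": {"world-directed": 0.60, "communicative": 0.05},
    "repetitive wave": {"world-directed": 0.02, "communicative": 0.50},
}

def posterior(action):
    joint = {i: priors[i] * likelihood[action][i] for i in priors}
    z = sum(joint.values())
    return {i: p / z for i, p in joint.items()}

for action in likelihood:
    print(action, posterior(action))
```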
  3.
    Open-ended programming increases students' motivation by allowing them to solve authentic problems and connect programming to their own interests. However, such open-ended projects are also challenging, as they often encourage students to explore new programming features and attempt tasks they have not learned before. Code examples are effective learning materials for students and are well suited to supporting open-ended programming. However, little work has examined how novices learn with examples during open-ended programming, and there are few real-world deployments of such tools. In this paper, we explore novices' learning barriers when interacting with code examples during open-ended programming. We deployed Example Helper, a tool that offers galleries of code examples to search and use, with 44 novice students in an introductory programming classroom working on an open-ended project in Snap. We found three high-level barriers that novices encountered when using examples: decision, search, and integration barriers. We discuss how these barriers arise and design opportunities to address them.
  4. Jodie Jenkinson, Susan Keen (Ed.)
    While visual literacy has been identified as a foundational skill in life science education, there are many challenges in teaching and assessing biomolecular visualization skills. Among these are the lack of consensus about what constitutes competence and limited understanding of student and instructor perceptions of visual literacy tasks. In this study, we administered a set of biomolecular visualization assessments, developed as part of the BioMolViz project, to both students and instructors at multiple institutions and compared their perceptions of task difficulty. We then analyzed our findings using a mixed-methods approach. Quantitative analysis was used to answer the following research questions: (1) Which assessment items exhibit statistically significant disparities or agreements in perceptions of difficulty between instructors and students? (2) Do these perceptions persist when controlling for race/ethnicity and gender? and (3) How does student perception of difficulty relate to performance? Qualitative analysis of open-ended comments was used to identify predominant themes related to visual problem solving. The results show that perceptions of difficulty significantly differ between students and instructors and that students’ performance is a significant predictor of their perception of difficulty. Overall, this study underscores the need to incorporate deliberate instruction in visualization into undergraduate life science curricula to improve student ability in this area. Accordingly, we offer recommendations to promote visual literacy skills in the classroom. 
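The study's statistical models are not specified in the abstract; as a minimal sketch of the reported analysis pattern (performance predicting perceived difficulty), assuming fabricated illustration data and an ordinary-least-squares fit via statsmodels:

```python
import numpy as np
import statsmodels.api as sm

# Fabricated illustration data: item score (0-1) and perceived difficulty (1-5).
rng = np.random.default_rng(1)
score = rng.uniform(0, 1, size=120)
perceived_difficulty = 4.5 - 2.0 * score + rng.normal(0, 0.5, size=120)

# OLS: does performance predict perception of difficulty?
X = sm.add_constant(score)
fit = sm.OLS(perceived_difficulty, X).fit()
print(fit.params)   # expect a negative slope: higher scores, lower perceived difficulty
print(fit.pvalues)
```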
  5.
    Automated event extraction in social science applications often requires corpus-level evaluations: for example, aggregating text predictions across metadata and producing unbiased estimates of recall. We combine corpus-level evaluation requirements with a real-world social science setting and introduce the IndiaPoliceEvents corpus, comprising all 21,391 sentences from 1,257 English-language Times of India articles about events in the state of Gujarat during March 2002. Our trained annotators read and label every document for mentions of police activity events, allowing for unbiased recall evaluations. In contrast to other datasets with structured event representations, we gather annotations by posing natural questions, and we evaluate off-the-shelf models on three different tasks: sentence classification, document ranking, and temporal aggregation of target events. We present baseline results from zero-shot BERT-based models fine-tuned on natural language inference and passage retrieval tasks. Our novel corpus-level evaluations and annotation approach can guide the creation of similar social-science-oriented resources in the future.
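The paper's exact models and prompts are not reproduced here; as a sketch of the zero-shot NLI-based sentence classification it describes, using the Hugging Face transformers pipeline (the BART-MNLI checkpoint and labels are our stand-ins, not necessarily the authors' choices):

```python
from transformers import pipeline

# Zero-shot sentence classification via an NLI-fine-tuned model, in the spirit
# of the paper's baselines; BART-MNLI is our stand-in choice, not necessarily theirs.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

sentence = "Police fired tear gas at protesters in the old city on Friday."
labels = ["police activity event", "no police activity"]

result = classifier(sentence, candidate_labels=labels)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```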