skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, October 10 until 2:00 AM ET on Friday, October 11 due to maintenance. We apologize for the inconvenience.


Title: A User-based Visual Analytics Workflow for Exploratory Model Analysis
Many visual analytics systems allow users to interact with machine learning models towards the goals of data exploration and insight generation on a given dataset. However, in some situations, insights may be less important than the production of an accurate predictive model for future use. In that case, users are more interested in generating of diverse and robust predictive models, verifying their performance on holdout data, and selecting the most suitable model for their usage scenario. In this paper, we consider the concept of Exploratory Model Analysis (EMA), which is defined as the process of discovering and selecting relevant models that can be used to make predictions on a data source. We delineate the differences between EMA and the well‐known term exploratory data analysis in terms of the desired outcome of the analytic process: insights into the data or a set of deployable models. The contributions of this work are a visual analytics system workflow for EMA, a user study, and two use cases validating the effectiveness of the workflow. We found that our system workflow enabled users to generate complex models, to assess them for various qualities, and to select the most relevant model for their task.  more » « less
Award ID(s):
1841349
NSF-PAR ID:
10104090
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Computer graphics forum
Volume:
38
Issue:
3
ISSN:
1467-8659
Page Range / eLocation ID:
185-199
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Exploratory data analysis of high-dimensional datasets is a crucial task for which visual analytics can be especially useful. However, the ad hoc nature of exploratory analysis can also lead users to draw incorrect causal inferences. Previous studies have demonstrated this risk and shown that integrating counterfactual concepts within visual analytics systems can improve users’ understanding of visualized data. However, effectively leveraging counterfactual concepts can be challenging, with only bespoke implementations found in prior work. Moreover, it can require expertise in both counterfactual subset analysis and visualization to implement the functionalities practically. This paper aims to help address these challenges in two ways. First, we propose an operator-based conceptual model for the use of counterfactuals that is informed by prior work in visualization research. Second, we contribute the Co-op library, an open and extensible reference implementation of this model that can support the integration of counterfactual-based subset computation with visualization systems. To evaluate the effectiveness and generalizability of Co-op, the library was used to construct two different visual analytics systems each supporting a distinct user workflow. In addition, expert interviews were conducted with professional visual analytics researchers and engineers to gain more insights regarding how Co-op could be leveraged. Finally, informed in part by these evaluation results, we distil a set of key design implications for effectively leveraging counterfactuals in future visualization systems. 
    more » « less
  2. Data visualization provides a powerful way for analysts to explore and make data-driven discoveries. However, current visual analytic tools provide only limited support for hypothesis-driven inquiry, as their built-in interactions and workflows are primarily intended for exploratory analysis. Visualization tools notably lack capabilities that would allow users to visually and incrementally test the fit of their conceptual models and provisional hypotheses against the data. This imbalance could bias users to overly rely on exploratory analysis as the principal mode of inquiry, which can be detrimental to discovery. In this paper, we introduce Visual (dis) Confirmation, a tool for conducting confirmatory, hypothesis-driven analyses with visualizations. Users interact by framing hypotheses and data expectations in natural language. The system then selects conceptually relevant data features and automatically generates visualizations to validate the underlying expectations. Distinctively, the resulting visualizations also highlight places where one's mental model disagrees with the data, so as to stimulate reflection. The proposed tool represents a new class of interactive data systems capable of supporting confirmatory visual analysis, and responding more intelligently by spotlighting gaps between one's knowledge and the data. We describe the algorithmic techniques behind this workflow. We also demonstrate the utility of the tool through a case study. 
    more » « less
  3. null (Ed.)
    Visual analytics systems enable highly interactive exploratory data analysis. Across a range of fields, these technologies have been successfully employed to help users learn from complex data. However, these same exploratory visualization techniques make it easy for users to discover spurious findings. This paper proposes new methods to monitor a user’s analytic focus during visual analysis of structured datasets and use it to surface relevant articles that contextualize the visualized findings. Motivated by interactive analyses of electronic health data, this paper introduces a formal model of analytic focus, a computational approach to dynamically update the focus model at the time of user interaction, and a prototype application that leverages this model to surface relevant medical publications to users during visual analysis of a large corpus of medical records. Evaluation results with 24 users show that the modeling approach has high levels of accuracy and is able to surface highly relevant medical abstracts. 
    more » « less
  4. Visualization tools facilitate exploratory data analysis, but fall short at supporting hypothesis-based reasoning. We conducted an exploratory study to investigate how visualizations might support a concept-driven analysis style, where users can optionally share their hypotheses and conceptual models in natural language, and receive customized plots depicting the fit of their models to the data. We report on how participants leveraged these unique affordances for visual analysis. We found that a majority of participants articulated meaningful models and predictions, utilizing them as entry points to sensemaking. We contribute an abstract typology representing the types of models participants held and externalized as data expectations. Our findings suggest ways for rearchitecting visual analytics tools to better support hypothesis- and model-based reasoning, in addition to their traditional role in exploratory analysis. We discuss the design implications and reflect on the potential benefits and challenges involved. 
    more » « less
  5. Many domains require analyst expertise to determine what patterns and data are interesting in a corpus. However, most analytics tools attempt to prequalify “interestingness” using algorithmic approaches to provide exploratory overviews. This overview-driven workflow precludes the use of qualitative analysis methodologies in large datasets. This paper discusses a preliminary visual analytics approach demonstrating how visual analytics tools can instead enable expert-driven qualitative analyses at scale by supporting computer-in-the-loop mixed-initiative approaches. We argue that visual analytics tools can support rich qualitative inference by using machine learning methods to continually model and refine what features correlate to an analyst’s on-going qualitative observations and by providing transparency into these features in order to aid analysts in navigating large corpora during qualitative analyses. We illustrate these ideas through an example from social media analysis and discuss open opportunities for designing visualizations that support qualitative inference through computer-in-the-loop approaches. 
    more » « less