skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, October 10 until 2:00 AM ET on Friday, October 11 due to maintenance. We apologize for the inconvenience.


Title: Reusing Interactive Analysis Workflows
Abstract

Interactive visual analysis has many advantages, but an important disadvantage is that analysis processes and workflows cannot be easily stored and reused. This is in contrast to code‐based analysis workflows, which can simply be run on updated datasets, and adapted when necessary. In this paper, we introduce methods to capture workflows in interactive visualization systems for different interactions such as selections, filters, categorizing/grouping, labeling, and aggregation. These workflows can then be applied to updated datasets, making interactive visualization sessions reusable. We demonstrate this specification using an interactive visualization system that tracks interaction provenance, and allows generating workflows from the recorded actions. The system can then be used to compare different versions of datasets and apply workflows to them. Finally, we introduce a Python library that can load workflows and apply it to updated datasets directly in a computational notebook, providing a seamless bridge between computational workflows and interactive visualization tools.

 
more » « less
Award ID(s):
1751238
NSF-PAR ID:
10406062
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Computer Graphics Forum
Volume:
41
Issue:
3
ISSN:
0167-7055
Page Range / eLocation ID:
p. 133-144
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Supporting the interactive exploration of large datasets is a popular and challenging use case for data management systems. Traditionally, the interface and the back-end system are built and optimized separately, and interface design and system optimization require different skill sets that are difficult for one person to master. To enable analysts to focus on visualization design, we contribute VegaPlus, a system that automatically optimizes interactive dashboards to support large datasets. To achieve this, VegaPlus leverages two core ideas. First, we introduce an optimizer that can reason about execution plans in Vega, a back-end DBMS, or a mix of both environments. The optimizer also considers how user interactions may alter execution plan performance, and can partially or fully rewrite the plans when needed. Through a series of benchmark experiments on seven different dashboard designs, our results show that VegaPlus provides superior performance and versatility compared to standard dashboard optimization techniques.

     
    more » « less
  2. Abstract Summary

    Gos is a declarative Python library designed to create interactive multiscale visualizations of genomics and epigenomics data. It provides a consistent and simple interface to the flexible Gosling visualization grammar. Gos hides technical complexities involved with configuring web-based genome browsers and integrates seamlessly within computational notebooks environments to enable new interactive analysis workflows.

    Availability and implementation

    Gos is released under the MIT License and available on the Python Package Index (PyPI). The source code is publicly available on GitHub (https://github.com/gosling-lang/gos), and documentation with examples can be found at https://gosling-lang.github.io/gos.

     
    more » « less
  3. The mapper algorithm is a popular tool from topological data analysis for extracting topological summaries of high-dimensional datasets. In this paper, we present Mapper Interactive, a web-based framework for the interactive analysis and visualization of high-dimensional point cloud data. It implements the mapper algorithm in an interactive, scalable, and easily extendable way, thus supporting practical data analysis. In particular, its command-line API can compute mapper graphs for 1 million points of 256 dimensions in about 3 minutes (4 times faster than the vanilla implementation). Its visual interface allows on-the-fly computation and manipulation of the mapper graph based on user-specified parameters and supports the addition of new analysis modules with a few lines of code. Mapper Interactive makes the mapper algorithm accessible to nonspecialists and accelerates topological analytics workflows. 
    more » « less
  4. Data visualization provides a powerful way for analysts to explore and make data-driven discoveries. However, current visual analytic tools provide only limited support for hypothesis-driven inquiry, as their built-in interactions and workflows are primarily intended for exploratory analysis. Visualization tools notably lack capabilities that would allow users to visually and incrementally test the fit of their conceptual models and provisional hypotheses against the data. This imbalance could bias users to overly rely on exploratory analysis as the principal mode of inquiry, which can be detrimental to discovery. In this paper, we introduce Visual (dis) Confirmation, a tool for conducting confirmatory, hypothesis-driven analyses with visualizations. Users interact by framing hypotheses and data expectations in natural language. The system then selects conceptually relevant data features and automatically generates visualizations to validate the underlying expectations. Distinctively, the resulting visualizations also highlight places where one's mental model disagrees with the data, so as to stimulate reflection. The proposed tool represents a new class of interactive data systems capable of supporting confirmatory visual analysis, and responding more intelligently by spotlighting gaps between one's knowledge and the data. We describe the algorithmic techniques behind this workflow. We also demonstrate the utility of the tool through a case study. 
    more » « less
  5. Abstract

    The field of connectomics aims to reconstruct the wiring diagram of Neurons and synapses to enable new insights into the workings of the brain. Reconstructing and analyzing the Neuronal connectivity, however, relies on many individual steps, starting from high‐resolution data acquisition to automated segmentation, proofreading, interactive data exploration, and circuit analysis. All of these steps have to handle large and complex datasets and rely on or benefit from integrated visualization methods. In this state‐of‐the‐art report, we describe visualization methods that can be applied throughout the connectomics pipeline, from data acquisition to circuit analysis. We first define the different steps of the pipeline and focus on how visualization is currently integrated into these steps. We also survey open science initiatives in connectomics, including usable open‐source tools and publicly available datasets. Finally, we discuss open challenges and possible future directions of this exciting research field.

     
    more » « less