This content will become publicly available on June 18, 2026

Title: Quality Control for Quality Computational Concepts: Wrangling With Theory and Data Wrangling as Theorizing
Award ID(s):
1934313
PAR ID:
10584081
Publisher / Repository:
Oxford University Press
Sponsoring Org:
National Science Foundation
More Like this
  1. Networks are a natural way of thinking about many datasets. The data on which a network is based, however, is rarely collected in a form that suits the analysis process, making it necessary to create and reshape networks. Data wrangling is widely acknowledged to be a critical part of the data analysis pipeline, yet interactive network wrangling has received little attention in the visualization research community. In this paper, we discuss a set of operations that are important for wrangling network datasets and introduce a visual data wrangling tool, Origraph, that enables analysts to apply these operations to their datasets. Key operations include creating a network from source data such as tables, reshaping a network by introducing new node or edge classes, filtering nodes or edges, and deriving new node or edge attributes. Origraph enables analysts to execute these operations with little to no programming and to immediately visualize the results. It provides views to investigate the network model, a sample of the network, and node and edge attributes. In addition, we introduce interfaces designed to aid analysts in specifying arguments for sensible network wrangling operations. We demonstrate the usefulness of Origraph in two use cases: first, we investigate gender bias in the film industry, and then the influence of money on political support for the war in Yemen.
  2. Ethical engagement is central to the practice of design, impacting stakeholders across and beyond technology organizations as well as producing downstream social and environmental impacts. Scholars have previously described the ecologically mediated nature of ethics in practice as a manifestation of “ethical design complexity”; however, the means of addressing this complexity are under-explored. In this provocation, we build on three years of prior empirical work on ethics and design practice to propose three ways of “wrangling” ethical design complexity: 1) articulating and interrogating complexity through constructed ethical dilemmas; 2) identifying potentially binding constraints through ethical tensions; and 3) describing and traversing naturalistically defined ethical situations. We leverage these three approaches to provoke further scholarship and ethically engaged design work.
  3. Data scientists reportedly spend 60 to 80 percent of their time in their daily routines on data wrangling, i.e., cleaning data and extracting features. However, data wrangling code is often repetitive and error-prone to write. Moreover, it is easy to introduce subtle bugs when reusing and adopting existing code, bugs that do not cause crashes but do reduce model quality. To support data scientists with data wrangling, we present a technique to generate interactive documentation for data wrangling code. We use (1) program synthesis techniques to automatically summarize data transformations and (2) test case selection techniques to purposefully select representative examples from the data based on execution information collected with tailored dynamic program analysis. We demonstrate that a JupyterLab extension with our technique can provide documentation for many cells in popular notebooks, and we find in a user study that users with our plugin are faster and more effective at finding realistic bugs in data wrangling code.