Hypothesis Formalization: Empirical Findings, Software Limitations, and Design Implications

Jun, Eunice; Birchfield, Melissa; De Moura, Nicole; Heer, Jeffrey; Just, René

doi:10.1145/3476980

Data analysis requires translating higher-level questions and hypotheses into computable statistical models. We present a mixed-methods study aimed at identifying the steps, considerations, and challenges involved in operationalizing hypotheses into statistical models, a process we refer to as hypothesis formalization. In a formative content analysis of 50 research papers, we find that researchers highlight decomposing a hypothesis into sub-hypotheses, selecting proxy variables, and formulating statistical models based on data collection design as key steps. In a lab study, we find that analysts fixated on implementation and shaped their analyses to fit familiar approaches, even if sub-optimal. In an analysis of software tools, we find that tools provide inconsistent, low-level abstractions that may limit the statistical models analysts use to formalize hypotheses. Based on these observations, we characterize hypothesis formalization as a dual-search process balancing conceptual and statistical considerations constrained by data and computation, and we discuss implications for future tools.

Award ID(s): 1901386

PAR ID: 10355052

Author(s) / Creator(s): Jun, Eunice; Birchfield, Melissa; De Moura, Nicole; Heer, Jeffrey; Just, René

Date Published: 2022-02-28

Journal Name: ACM Transactions on Computer-Human Interaction

Volume: 29

Issue: 1

ISSN: 1073-0516

Page Range / eLocation ID: 1-28

Sponsoring Org: National Science Foundation

Journal Article: https://doi.org/10.1145/3476980
