Ferret: Reviewing Tabular Datasets for Manipulation

Lange, Devin (ORCID: 0000-0002-3467-0294); Sahai, Shaurya (ORCID: 0000-0001-7036-7412); Phillips, Jeff M. (ORCID: 0000-0003-1169-2965); Lex, Alexander (ORCID: 0000-0001-6930-5468)

doi:10.1111/cgf.14822

Abstract: How do we ensure the veracity of science? The act of manipulating or fabricating scientific data has led to many high‐profile fraud cases and retractions. Detecting manipulated data, however, is a challenging and time‐consuming endeavor. Automated detection methods are limited due to the diversity of data types and manipulation techniques. Furthermore, patterns automatically flagged as suspicious can have reasonable explanations. Instead, we propose a nuanced approach where experts analyze tabular datasets, e.g., as part of the peer‐review process, using a guided, interactive visualization approach. In this paper, we present an analysis of how manipulated datasets are created and the artifacts these techniques generate. Based on these findings, we propose a suite of visualization methods to surface potential irregularities. We have implemented these methods in Ferret, a visualization tool for data forensics work. Ferret makes potential data issues salient and provides guidance on spotting signs of tampering and differentiating them from truthful data.

Award ID(s): 1751238

PAR ID: 10426691

Author(s) / Creator(s): Lange, Devin; Sahai, Shaurya; Phillips, Jeff M.; Lex, Alexander

Publisher / Repository: Wiley-Blackwell

Date Published: 2023-06-27

Journal Name: Computer Graphics Forum

Volume: 42

Issue: 3

ISSN: 0167-7055

Page Range / eLocation ID: p. 187-198

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1111/cgf.14822
