Title: Truth in a sea of data: adoption and use of data search tools among researchers and journalists
The increasing availability of data search tools brings opportunities for non-expert users. Among these users, interdisciplinary researchers and data journalists represent a growing population whose work can lead to societal benefit. Through in-depth interviews, we examine what strategies and approaches researchers and journalists adopt to search online data, how they apply current technology to facilitate dataset search, and the barriers and difficulties that they encounter in their work with data. Our findings reveal that, owing to technological limitations in searchability, interactivity, and usability, dataset search remains a challenge for non-experts. We have found that little attention has been paid to non-experts’ emerging data needs, which significantly constrains the design and development of technological tools for supporting non-expert users. Our findings underline the critical impact of the design, development, and deployment of technological tools in enabling the meaningful use of today’s increasingly available data toward a civil society.
Award ID(s):
1816325
PAR ID:
10393252
Journal Name:
Information, Communication & Society
ISSN:
1369-118X
Page Range / eLocation ID:
1 to 20
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The rise of automated text processing systems has led to the development of tools designed for a wide variety of application domains. These technologies are often developed to support non-technical users such as domain experts, and are often developed in isolation from the tools' primary users. While such developments are exciting, less attention has been paid to domain experts’ expectations about the values embedded in these automated systems. As a step toward addressing that gap, we examined the values expectations of journalists and legal experts. Both of these domains involve extensive text processing and place high importance on values in professional practice. We engaged participants from two non-profit organizations in two separate co-speculation design workshops centered around several speculative automated text processing systems. This study makes three interrelated contributions. First, we provide a detailed investigation of domain experts’ values expectations around future NLP systems. Second, the speculative design fiction concepts, which we specifically crafted for these investigative journalists and legal experts, illuminated a series of tensions around the technical implementation details of automation. Third, our findings highlight the utility of design fiction in eliciting not-to-design implications, not only about automated NLP but also about technology more broadly. Overall, our study findings provide groundwork for including the values of domain experts, whose expertise lies outside the field of computing, in the design of automated NLP systems.
  2. The success of DL can be attributed to hours of parameter and architecture tuning by human experts. Neural Architecture Search (NAS) techniques aim to solve this problem by automating the search procedure for DNN architectures, making it possible for non-experts to work with DNNs. Specifically, One-Shot NAS techniques have recently gained popularity as they are known to reduce the search time for NAS techniques. One-Shot NAS works by training a large template network, which includes all the candidate NNs, through parameter sharing. This is followed by a procedure that ranks its components by evaluating randomly chosen candidate architectures. However, as these search models become increasingly powerful and diverse, they become harder to understand. Consequently, even though the search results work well, it is hard to identify search biases and control the search progression, hence a need for explainability and human-in-the-loop (HIL) One-Shot NAS. To alleviate these problems, we present NAS-Navigator, a visual analytics (VA) system aiming to solve three problems with One-Shot NAS: explainability, HIL design, and performance improvements compared to existing state-of-the-art (SOTA) techniques. NAS-Navigator puts full control of NAS back in the hands of the users while still keeping the perks of automated search, thus assisting non-expert users. Analysts can use their domain knowledge, aided by cues from the interface, to guide the search. Evaluation results confirm that the performance of our improved One-Shot NAS algorithm is comparable to other SOTA techniques, while adding VA through NAS-Navigator shows further improvements in search time and performance. We designed our interface in collaboration with several deep learning researchers and evaluated NAS-Navigator through a control experiment and expert interviews.
  3. Investigative data journalists work with a variety of data sources to tell a story. Prior work has indicated that there is a close relationship between journalists' data work practices and those of data scientists. However, these relationships and data work practices have not been empirically examined, and understanding them is crucial to inform the design of tools that are used by different groups of people, including data scientists and data journalists. To bridge this gap, we studied investigative reporters' data work practices with one non-profit investigative newsroom. Our study design includes two activities: 1) semi-structured interviews with journalists, and 2) a sketching activity allowing journalists to depict examples of their work practices. By analyzing these data and synthesizing them across related prior work, we propose the major phases in the data-driven investigative journalism story idea generation process. Our study findings show that the journalists employ a collection of multiple, iterative, cyclic processes to identify journalistically "interesting" story ideas. These processes both significantly resemble and show subtle, nuanced differences from data science work practices identified in prior research. We further verified our proposal through a member check with key informants. This work offers three primary contributions. First, it provides a close glimpse into the main phases of investigative journalists' data-driven story idea generation technique. Second, it complements prior work studying formal data science practices by examining data-driven investigative journalists, whose primary expertise lies outside computing. Third, it identifies particular points in the data exploration processes that would benefit from design interventions and suggests future research directions.
  4. Recent studies have indicated that visually embellished charts such as infographics have the ability to engage viewers and positively affect memorability. Fueled by these findings, researchers have proposed a variety of infographic design tools. However, these tools do not cover the entire design space. In this work, we identify a subset of infographics that we call infomages. Infomages are casual visuals of data in which a data chart is embedded into a thematic image such that the content of the image reflects the subject and the designer's interpretation of the data. Creating an effective infomage, however, can require a fair amount of design expertise and is thus out of reach for most people. To give non‐artists the means to design convincing infomages as well, we first study the principled design of existing infomages and identify a set of key chart embedding techniques. Informed by these findings, we build a design tool that links web‐scale image search with a set of interactive image processing tools to enable novice users to design a wide variety of infomages. As the embedding process might introduce some amount of visual distortion of the data, our tool also helps users gauge the amount of this distortion, if any. We experimentally demonstrate the usability of our tool and conclude with a discussion of infomages and our design tool.
  5. Understanding and reasoning with multidimensional data is a critical skill for students in various disciplines. This study explores how data experts navigate and analyze unfamiliar multidimensional datasets. Through our interviews with nine data experts, we identified three main approaches: (1) manipulating flat tables, (2) creating relational databases, and (3) using computational commands. These findings challenge our initial assumption that making hierarchy would be a common expert data move. Rather than revealing a “typical” strategy, these interviews yielded a range of approaches, with most experts describing more than one approach and how they would decide between them. These insights will inform the design of pedagogical techniques and tools to support students’ reasoning with multidimensional data. 