Search for: All records where Award ID contains 1845460


  1. The scale of scientific publishing continues to grow, overloading science journalists, who are inundated with choices about what would be most interesting, important, and newsworthy to cover in their reporting. Our work addresses this problem by examining the viability of a predictive model of the newsworthiness of scientific articles trained on crowdsourced evaluations of newsworthiness. We first evaluate the potential of crowdsourced newsworthiness evaluations by assessing their alignment with expert ratings, analyzing both quantitative correlations and qualitative rating rationales to understand their limitations. We then demonstrate and evaluate a predictive model trained on these crowd ratings together with arXiv article metadata, text, and other computed features. Using the crowdsourcing protocol we developed, we find that while crowdsourced ratings of newsworthiness often align moderately with expert ratings, there are also notable divergences that limit the approach. Despite these limitations, the predictive model we built produces reasonably precise rankings when validated against expert evaluations (P@10 = 0.8, P@15 = 0.67), suggesting that a viable signal can be learned from crowdsourced evaluations of newsworthiness. Based on these findings, we discuss opportunities for future work that leverages crowdsourcing and predictive approaches to support journalists in discovering and filtering newsworthy information.
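To make the evaluation setup concrete, here is a minimal sketch of the kind of pipeline the abstract describes: train a regressor on crowdsourced newsworthiness ratings, rank held-out articles by predicted score, and validate the top of the ranking against expert judgments with precision@k. The synthetic features, labels, and model choice are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def precision_at_k(ranked_ids, expert_positive, k):
    """Fraction of the top-k ranked articles that experts deemed newsworthy."""
    return sum(1 for a in ranked_ids[:k] if a in expert_positive) / k

# Stand-ins for article features (metadata, text, computed signals) and
# mean crowdsourced newsworthiness ratings; in practice these would come
# from the annotated corpus.
X = rng.normal(size=(200, 8))
y_crowd = 0.7 * X[:, 0] + rng.normal(scale=0.5, size=200)

X_train, X_heldout = X[:150], X[150:]
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_crowd[:150])

# Rank held-out articles by predicted newsworthiness, highest first.
scores = model.predict(X_heldout)
ranked = list(np.argsort(-scores))

# Hypothetical expert labels for the held-out articles.
expert_positive = {i for i in range(50) if y_crowd[150 + i] > 0.5}
print(precision_at_k(ranked, expert_positive, k=10))
print(precision_at_k(ranked, expert_positive, k=15))
```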
  2. Government use of algorithmic decision-making (ADM) systems is widespread and diverse, and holding these increasingly high-impact, often opaque government algorithms accountable presents a number of challenges. Some European governments have launched registries of ADM systems used in public services, and some transparency initiatives exist for algorithms in specific areas of the United States government; however, the U.S. lacks an overarching registry that catalogs algorithms in use for public-service delivery throughout the government. This paper conducts an inductive thematic analysis of over 700 government ADM systems cataloged by the Algorithm Tips database in order to describe the various ways government algorithms might be understood and to inform downstream uses of such an algorithmic catalog. We describe the challenge of government algorithm accountability, the Algorithm Tips database and our method for conducting the thematic analysis, and the themes that emerge concerning the topics and issues, levels of sophistication, interfaces, and utilities of U.S. government algorithms. Through these themes, we contribute several descriptions of government algorithm use across the U.S. at the federal, state, and local levels, which can inform stakeholders such as journalists, members of civil society, and government policymakers.
  3. Many journalists and newsrooms now incorporate audience contributions into their sourcing practices by leveraging user-generated content (UGC). However, their sourcing needs and practices as they seek information from UGC are still not well understood by researchers or well supported by tools. This paper first reports the results of a qualitative interview study with nine professional journalists about their UGC sourcing practices, detailing what journalists typically look for in UGC and elaborating on two UGC sourcing approaches: deep reporting and wide reporting. These findings then inform a human-centered design approach to prototyping a UGC sourcing tool for journalists, which enables them to interactively filter and rank UGC based on example content provided by users. We evaluate the prototype with nine professional journalists who source UGC in their daily routines to understand how the tool enables and transforms UGC sourcing practices, while also uncovering opportunities for future research and design to support journalistic sourcing practices and sensemaking processes.
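As one way to picture the query-by-example interaction, the sketch below ranks candidate UGC posts by their textual similarity to a journalist-selected example. The TF-IDF and cosine-similarity approach and the sample posts are assumptions for illustration; the abstract does not specify the prototype's ranking internals.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical pool of user-generated posts to source from.
posts = [
    "Flooding on Main St, water up to car doors",
    "Great coffee at the new cafe downtown",
    "Power lines down near the river after the storm",
    "Selling concert tickets, DM me",
]
# A journalist-picked exemplar that defines what they are looking for.
examples = ["Storm damage photos from the riverfront"]

vec = TfidfVectorizer(stop_words="english")
post_vecs = vec.fit_transform(posts)
example_vec = vec.transform(examples)

# Rank posts by similarity to the example content, highest first.
scores = cosine_similarity(post_vecs, example_vec).ravel()
for score, post in sorted(zip(scores, posts), reverse=True):
    print(f"{score:.2f}  {post}")
```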
  4. Computational news discovery refers to the use of algorithms to orient editorial attention to potentially newsworthy events or information prior to publication. In this paper we describe the design, development, and initial evaluation of a computational news discovery tool, called Lead Locator, which is geared toward supplementing national politics reporting by suggesting potentially interesting locations to report on. Drawing on massive amounts of data from a national voter file, Lead Locator ranks counties based on statistical properties such as their extremity in the distribution of a variable of interest (e.g., voter turnout) as well as their political relevance in terms of shifts in voting patterns. It then presents an automatically generated tip sheet of potentially interesting locations that reporters can interactively browse and search to help inform their reporting ideas.
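The ranking idea can be illustrated with a small sketch: score each county by how extreme it is on a variable of interest (a turnout z-score here) plus the size of its shift in vote margin, then sort the result into a tip-sheet-style table. The column names, toy data, and combined score are hypothetical; the abstract characterizes Lead Locator's scoring only at a high level.

```python
import pandas as pd

# Toy stand-in for county-level aggregates from a voter file.
counties = pd.DataFrame({
    "county":      ["Ada", "Bexar", "Cook", "Dane"],
    "turnout":     [0.81, 0.52, 0.63, 0.88],
    "margin_2016": [0.12, -0.08, -0.30, -0.25],
    "margin_2020": [0.02, -0.15, -0.28, -0.33],
})

# Extremity: how far a county's turnout sits from the mean, in std units.
z = (counties["turnout"] - counties["turnout"].mean()) / counties["turnout"].std()
# Political relevance: magnitude of the shift in vote margin between cycles.
shift = (counties["margin_2020"] - counties["margin_2016"]).abs()

# One simple way to combine the two signals into a single lead score.
counties["lead_score"] = z.abs() + shift.rank(pct=True)
tip_sheet = counties.sort_values("lead_score", ascending=False)
print(tip_sheet[["county", "turnout", "lead_score"]])
```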