Title: Auditing News Curation Systems: A Case Study Examining Algorithmic and Editorial Logic in Apple News
This work presents an audit study of Apple News as a sociotechnical news curation system that exercises gatekeeping power in the media. We examine the mechanisms behind Apple News as well as the content presented in the app, outlining the social, political, and economic implications of both aspects. We focus on the Trending Stories section, which is algorithmically curated, and the Top Stories section, which is human-curated. Results from a crowdsourced audit showed minimal content personalization in the Trending Stories section, and a sock-puppet audit showed no location-based content adaptation. Finally, we perform an extended two-month data collection to compare the human-curated Top Stories section with the algorithmically-curated Trending Stories section. Within these two sections, human curation outperformed algorithmic curation in several measures of source diversity, concentration, and evenness. Furthermore, algorithmic curation featured more “soft news” about celebrities and entertainment, while editorial curation featured more news about policy and international events. To our knowledge, this study provides the first data-backed characterization of Apple News in the United States.
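The abstract does not name the specific measures of source diversity, concentration, and evenness used. As a rough illustration only, the sketch below computes three common choices (Shannon entropy, the Herfindahl-Hirschman index, and Pielou's evenness) over a list of story sources. The function name `source_metrics` and the sample data are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def source_metrics(sources):
    """Summarize how stories are distributed across news sources.

    `sources` holds one source name per story. The measures here are
    assumed stand-ins; the abstract does not say which ones the study used.
    """
    counts = Counter(sources)
    total = sum(counts.values())
    # Share of all stories contributed by each source.
    shares = [c / total for c in counts.values()]

    # Diversity: Shannon entropy (natural log); higher means more diverse.
    shannon = -sum(p * math.log(p) for p in shares)
    # Concentration: Herfindahl-Hirschman index; 1.0 means a single source.
    hhi = sum(p * p for p in shares)
    # Evenness: Pielou's J, entropy divided by its maximum, ln(#sources).
    evenness = shannon / math.log(len(counts)) if len(counts) > 1 else 0.0

    return {
        "unique_sources": len(counts),
        "shannon": shannon,
        "hhi": hhi,
        "evenness": evenness,
    }


if __name__ == "__main__":
    # Hypothetical data: one section dominated by a single source,
    # one spread evenly across four sources.
    skewed = ["A"] * 7 + ["B", "C", "D"]
    balanced = ["A", "B", "C", "D"] * 3
    print(source_metrics(skewed))
    print(source_metrics(balanced))
```

Under these definitions, a section that draws evenly from many sources scores high on entropy and evenness and low on concentration, which is the sense in which one curation method could be said to outperform another.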
Award ID(s):
1717330
NSF-PAR ID:
10175595
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the International AAAI Conference on Weblogs and Social Media
ISSN:
2334-0770
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents a crowdsourced auditing framework for news aggregators and applies it to the trending section of Apple News. The framework audits the aggregator algorithm, determining the refresh interval and detecting the presence of "adaptation" (an aggregator presenting different headlines based on a user's location or individual preferences). It is also used for a content audit which tabulates the distribution of news sources found in the aggregator. We deploy this framework on the trending stories section of Apple News, observing (1) a refresh interval of approximately 60 minutes, (2) adaptation at the user level, and (3) a unique distribution of news sources that prompts further investigation. 
  2. This paper presents an algorithm audit of the Google Top Stories box, a prominent component of search engine results and powerful driver of traffic to news publishers. As such, it is important in shaping user attention towards news outlets and topics. By analyzing the number of appearances of news article links we contribute a series of novel analyses that provide an in-depth characterization of news source diversity and its implications for attention via Google search. We present results indicating a considerable degree of source concentration (with variation among search terms), a slight exaggeration in the ideological skew of news in comparison to a baseline, and a quantification of how the presentation of items translates into traffic and attention for publishers. We contribute insights that underscore the power that Google wields in exposing users to diverse news information, and raise important questions and opportunities for future work on algorithmic news curation. 
  3. Espinosa-Anke, Luis ; Martín-Vide, Carlos ; Spasić, Irena (Ed.)
    Algorithmic journalism refers to automatic AI-constructed news stories. There have been successful commercial implementations for news stories in sports, weather, financial reporting and similar domains with highly structured, well defined tabular data sources. Other domains such as local reporting have not seen adoption of algorithmic journalism, and thus no automated reporting systems are available in these categories which can have important implications for the industry. In this paper, we demonstrate a novel approach for producing news stories on government legislative activity, an area that has not widely adopted algorithmic journalism. Our data source is state legislative proceedings, primarily the transcribed speeches and dialogue from floor sessions and committee hearings in US State legislatures. Specifically, we create a library of potential events called phenoms. We systematically analyze the transcripts for the presence of phenoms using a custom partial order planner. Each phenom, if present, contributes some natural language text to the generated article: either stating facts, quoting individuals or summarizing some aspect of the discussion. We evaluate two randomly chosen articles with a user study on Amazon Mechanical Turk with mostly Likert scale questions. Our results indicate a high degree of achievement for accuracy of facts and readability of final content with 13 of 22 users in the first article and 19 of 20 subjects of the second article agreeing or strongly agreeing that the articles included the most important facts of the hearings. Other results strengthen this finding in terms of accuracy, focus and writing quality. 
  4. Choosing the political party nominees, who will appear on the ballot for the US presidency, is a long process that starts two years before the general election. The news media plays a particular role in this process by continuously covering the state of the race. How can this news coverage be characterized? Given that there are thousands of news organizations, but each of us is exposed to only a few of them, we might be missing most of it. Online news aggregators, which aggregate news stories from a multitude of news sources and perspectives, could provide an important lens for the analysis. One such aggregator is Google’s Top stories, a recent addition to Google’s search result page. For the duration of 2019, we have collected the news headlines that Google Top stories has displayed for 30 candidates of both US political parties. Our dataset contains 79,903 news story URLs published by 2,168 unique news sources. Our analysis indicates that despite this large number of news sources, there is a very skewed distribution of where the Top stories are originating, with a very small number of sources contributing the majority of stories. We are sharing our dataset so that other researchers can answer questions related to algorithmic curation of news as well as media agenda setting in the context of political elections. 
  5. Smart speakers are becoming ubiquitous in daily life. The widespread and increasing use of smart speakers for news and information in society presents new questions related to the quality, source diversity and credibility, and reliability of algorithmic intermediaries for news consumption. While user adoption rates soar, audit instruments for assessing information quality in smart speakers are lagging. As an initial effort, we present a conceptual framework and data-driven approach for evaluating smart speakers for information quality. We demonstrate the application of our framework on the Amazon Alexa voice assistant and identify key information provenance and source credibility problems as well as systematic differences in the quality of responses about hard and soft news. Our study has broad implications for news media and society, content production, and information quality assessment. 