Title: Opening Up the Black Box: Auditing Google’s Top Stories Algorithm
Algorithmic auditing has emerged as an important methodology for gleaning insights from opaque platform algorithms. These audits often rely on repeated observations of an algorithm’s outputs given a fixed set of inputs. For example, to audit Google search, one repeatedly inputs queries and captures the resulting search pages. The goal is then to uncover patterns in the data that reveal the “secrets” of algorithmic decision making. In this paper, we introduce one particular algorithm audit, that of Google’s Top stories. We describe the process of data collection, exploration, and analysis for this application and share some of the insights. Concretely, our analysis suggests that Google may be trying to burst the “filter bubble” by choosing lesser-known publishers for the third position in the Top stories. In addition to revealing the behavior of the platform, the audit also showed that a subset of publishers covers certain stories more than others do.
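As a rough illustration of the kind of pattern-finding the abstract describes, the sketch below tallies which publishers appear at each position of a Top stories panel across repeated observations. It is not the authors' code: the observation rows, the publisher names, and the helper functions `publishers_by_position` and `distinct_publishers` are all hypothetical, and the abstract does not say how the captured pages were actually stored or analyzed.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (query, panel position, publisher).
# These rows are invented for illustration; they are not data from the paper.
observations = [
    ("candidate a", 1, "bignews.example"),
    ("candidate a", 2, "bignews.example"),
    ("candidate a", 3, "smallpaper-one.example"),
    ("candidate b", 1, "bignews.example"),
    ("candidate b", 2, "dailypost.example"),
    ("candidate b", 3, "smallpaper-two.example"),
    ("candidate c", 1, "bignews.example"),
    ("candidate c", 2, "bignews.example"),
    ("candidate c", 3, "smallpaper-three.example"),
]


def publishers_by_position(rows):
    """Count how often each publisher appears at each panel position."""
    counts = defaultdict(Counter)
    for _query, position, publisher in rows:
        counts[position][publisher] += 1
    return counts


def distinct_publishers(counts):
    """Return the number of distinct publishers seen at each position."""
    return {position: len(c) for position, c in sorted(counts.items())}


if __name__ == "__main__":
    counts = publishers_by_position(observations)
    for position, n in distinct_publishers(counts).items():
        top_publisher, top_count = counts[position].most_common(1)[0]
        print(f"position {position}: {n} distinct publishers, "
              f"most frequent is {top_publisher} ({top_count} appearances)")
```

With real audit data, a position that draws on many more distinct publishers than the others would be one signal worth examining further; on its own it would not establish why the algorithm behaves that way.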
Award ID(s):
1751087
NSF-PAR ID:
10101277
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the ... International Florida Artificial Intelligence Research Society Conference
Volume:
32
ISSN:
2334-0754
Page Range / eLocation ID:
376-382
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents an algorithm audit of the Google Top Stories box, a prominent component of search engine results and a powerful driver of traffic to news publishers. As such, it is important in shaping user attention towards news outlets and topics. By analyzing the number of appearances of news article links, we contribute a series of novel analyses that provide an in-depth characterization of news source diversity and its implications for attention via Google search. We present results indicating a considerable degree of source concentration (with variation among search terms), a slight exaggeration in the ideological skew of news in comparison to a baseline, and a quantification of how the presentation of items translates into traffic and attention for publishers. We contribute insights that underscore the power that Google wields in exposing users to diverse news information, and raise important questions and opportunities for future work on algorithmic news curation.
  2. When one searches for political candidates on Google, a panel composed of recent news stories, known as Top stories, is commonly shown at the top of the search results page. These stories are selected by an algorithm that chooses from hundreds of thousands of articles published by thousands of news publishers. In our previous work, we identified 56 news sources that contributed 2/3 of all Top stories for 30 political candidates running in the primaries of the 2020 US Presidential Election. In this paper, we survey US voters to elicit their familiarity with and trust in these 56 news outlets. We find that some of the most frequent outlets are not familiar to all voters (e.g. The Hill or Politico), or particularly trusted by voters of any political stripe (e.g. Washington Examiner or The Daily Beast). Why, then, are such sources shown so frequently in Top stories? We theorize that Google is sampling news articles from sources with different political leanings to offer balanced coverage. This is reminiscent of the so-called “fairness doctrine” (1949-1987) policy in the United States that required broadcasters (radio or TV stations) to air contrasting views about controversial matters. Because there are fewer right-leaning publications than center or left-leaning ones, in order to maintain this “fair” balance, hyper-partisan far-right news sources of low trust receive more visibility than some news sources that are more familiar to and trusted by the public.
  3. Choosing the political party nominees who will appear on the ballot for the US presidency is a long process that starts two years before the general election. The news media plays a particular role in this process by continuously covering the state of the race. How can this news coverage be characterized? Given that there are thousands of news organizations, but each of us is exposed to only a few of them, we might be missing most of it. Online news aggregators, which aggregate news stories from a multitude of news sources and perspectives, could provide an important lens for the analysis. One such aggregator is Google’s Top stories, a recent addition to Google’s search result page. Throughout 2019, we collected the news headlines that Google Top stories displayed for 30 candidates of both US political parties. Our dataset contains 79,903 news story URLs published by 2,168 unique news sources. Our analysis indicates that despite this large number of news sources, the distribution of where Top stories originate is highly skewed, with a very small number of sources contributing the majority of stories. We are sharing our dataset so that other researchers can answer questions related to algorithmic curation of news as well as media agenda setting in the context of political elections.
  4. This work presents an audit study of Apple News as a sociotechnical news curation system that exercises gatekeeping power in the media. We examine the mechanisms behind Apple News as well as the content presented in the app, outlining the social, political, and economic implications of both aspects. We focus on the Trending Stories section, which is algorithmically curated, and the Top Stories section, which is human-curated. Results from a crowdsourced audit showed minimal content personalization in the Trending Stories section, and a sock-puppet audit showed no location-based content adaptation. Finally, we perform an extended two-month data collection to compare the human-curated Top Stories section with the algorithmically-curated Trending Stories section. Within these two sections, human curation outperformed algorithmic curation in several measures of source diversity, concentration, and evenness. Furthermore, algorithmic curation featured more “soft news” about celebrities and entertainment, while editorial curation featured more news about policy and international events. To our knowledge, this study provides the first data-backed characterization of Apple News in the United States. 
  5. Even though a restaurant may receive different ratings across review platforms, people often see only one rating during a local search (e.g. 'best burgers near me'). In this paper, we examine the differences in ratings between two commonly used review platforms: Google Maps and Yelp. We found that restaurant ratings on Google Maps are, on average, 0.7 stars higher than those on Yelp, with the increase being driven in large part by higher ratings for chain restaurants on Google Maps. We also found extensive diversity in top-ranked restaurants by geographic region across platforms. For example, for a given metropolitan area, there exists little overlap in its top ten lists of restaurants on Google Maps and Yelp. Our results problematize the use of a single review platform in local search and have implications for end users of ratings and local search technologies. We outline concrete design recommendations to improve communication of restaurant evaluation and discuss the potential causes for the divergence we observed.