Search engines, by ranking a few links ahead of millions of others based on opaque rules, open themselves up to criticism of bias. Previous research has focused on measuring political bias of search engine algorithms to detect possible search engine manipulation effects on voters or unbalanced ideological representation in search results. Insofar as these concerns are related to the principle of fairness, this notion of fairness can be seen as explicitly oriented toward election candidates or political processes and only implicitly oriented toward the public at large. Thus, we ask the following research question: how should an auditing framework that is explicitly centered on the principle of ensuring and maximizing fairness for the public (i.e., voters) operate? To answer this question, we qualitatively explore four datasets about elections and politics in the United States: 1) a survey of eligible U.S. voters about their information needs ahead of the 2018 U.S. elections, 2) a dataset of biased political phrases used in a large-scale Google audit ahead of the 2018 U.S. election, 3) Google’s “related searches” phrases for two groups of political candidates in the 2018 U.S. election (one group is composed entirely of women), and 4) autocomplete suggestions and result pages for a set of searches on the day of a statewide election in the U.S. state of Virginia in 2019. We find that voters have much broader information needs than the search engine audit literature has accounted for in the past, and that relying on political science theories of voter modeling provides a good starting point for informing the design of voter-centered audits.
Partisan search behavior and Google results in the 2018 U.S. midterm elections
This research shows that members of different ideological groups in the United States can use different search terms when looking for information about political candidates, but that difference is not enough to yield divergent search results on Google. Search engines are central in information seeking during elections, and have important implications for the distribution of information and, by extension, for democratic society. Using a method involving surveys, qualitative coding, and quantitative analysis of search terms and search results, we show that the sources of information that are returned by Google for both liberal and conservative search terms are strongly correlated. We collected search terms from people with different ideological positions about Senate candidates in the 2018 midterm election from the two main parties in the U.S., in three large and politically distinct states: California, Ohio, and Texas. We then used those search terms to scrape web results and analyze them. Our analysis shows that, in terms of the differences arising from individual search term choices, Google results exhibit a mainstreaming effect that partially neutralizes differentiation of search behaviors, by providing a set of common results, even to dissimilar searches. Based on this analysis, this article offers two main contributions: first, in the development of a method for determining group-level differences based on search input bias; and second, in demonstrating how search engines respond to diverse information seeking behavior and whether that may have implications for public discourse.
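The comparison described above, checking whether the sources returned for liberal and conservative search terms are correlated, can be sketched as a rank correlation over domain counts. This is a minimal illustration only: the abstract does not say which correlation statistic was used, so Spearman's rank correlation is an assumption here, and the function names and example domains are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def source_correlation(domains_a, domains_b):
    """Spearman correlation of domain frequencies across two result sets.

    Assumption: each argument is a flat list of result domains collected
    for one group's search terms (hypothetical input format).
    """
    counts_a, counts_b = Counter(domains_a), Counter(domains_b)
    all_domains = sorted(set(counts_a) | set(counts_b))
    freq_a = [counts_a[d] for d in all_domains]
    freq_b = [counts_b[d] for d in all_domains]
    return pearson(average_ranks(freq_a), average_ranks(freq_b))
```

A value near 1 would indicate that both groups' queries surface largely the same sources in similar proportions, which is the pattern the abstract calls a mainstreaming effect.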
- Award ID(s): 1717330
- PAR ID: 10175592
- Date Published:
- Journal Name: Information, Communication & Society
- ISSN: 1369-118X
- Page Range / eLocation ID: 1 to 17
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
This paper presents an algorithm audit of the Google Top Stories box, a prominent component of search engine results and powerful driver of traffic to news publishers. As such, it is important in shaping user attention towards news outlets and topics. By analyzing the number of appearances of news article links, we contribute a series of novel analyses that provide an in-depth characterization of news source diversity and its implications for attention via Google search. We present results indicating a considerable degree of source concentration (with variation among search terms), a slight exaggeration in the ideological skew of news in comparison to a baseline, and a quantification of how the presentation of items translates into traffic and attention for publishers. We contribute insights that underscore the power that Google wields in exposing users to diverse news information, and raise important questions and opportunities for future work on algorithmic news curation.
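Source concentration of the kind mentioned in this abstract can be quantified in several ways. The sketch below shows two common ones, the share of appearances held by the top k sources and the Herfindahl-Hirschman index (HHI). Both are illustrative choices, not necessarily the measures used in the paper, and the function name and input format are hypothetical.

```python
from collections import Counter


def concentration(sources, top_k=3):
    """Summarize how concentrated link appearances are among sources.

    Assumption: `sources` is a flat list with one entry per link
    appearance, naming the publishing source (hypothetical format).
    """
    if not sources:
        raise ValueError("sources must not be empty")
    counts = Counter(sources)
    total = sum(counts.values())
    # Each source's fraction of all appearances, largest first.
    shares = sorted((c / total for c in counts.values()), reverse=True)
    return {
        "top_k_share": sum(shares[:top_k]),
        "hhi": sum(s * s for s in shares),  # 1/n when even, 1.0 for a monopoly
    }
```

Computing these per search term, rather than only in aggregate, would expose the variation among search terms that the abstract notes.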
-
In September 2019, Hurricane Dorian struck the Bahamas and the southeast United States, resulting in widespread damage and loss of life. Drawing from previous crisis communication research on both natural and man-made disasters, this study examines information seeking and medium preferences, attention allocation, and sex differences in these outcomes. Extant literature has found differences between men and women in terms of the volume and types of information wanted during a crisis event, as well as preferences for different media in times of crisis. This literature has yet to examine the degree to which attention allocation may be related to these outcomes. To address these issues in a naturalistic context, a large-scale survey was targeted at residents of states impacted by Hurricane Dorian. Results are consistent with previous research indicating that females engaged in more overall information seeking and sought more information related to tangible goals. Females found interactive media (Internet and social media) to be more useful than males did. No evidence was detected of sex differences in the way people found out about the storm or in attention allocation. Results suggested small effects for perceived usefulness of television and Internet on attention allocation for both men and women. Implications for emergency management personnel and public officials are discussed.
-
The abundance of media options is a central feature of today’s information environment. Many accounts, often based on analysis of desktop-only news use, suggest that this increased choice leads to audience fragmentation, ideological segregation, and echo chambers with no cross-cutting exposure. Contrary to many of those claims, this paper uses observational multiplatform data capturing both desktop and mobile use to demonstrate that coexposure to diverse news is on the rise, and that ideological self-selection does not explain most of that coexposure. We show that mainstream media outlets offer the common ground where ideologically diverse audiences converge online, though our analysis also reveals that more than half of the US online population consumes no online news, underlining the risk of increased information inequality driven by self-selection along lines of interest. For this study, we use an unprecedented combination of observed data from the United States comprising a 5-year time window and involving tens of thousands of panelists. Our dataset traces news consumption across different devices and unveils important differences in news diets when multiplatform or desktop-only access is used. We discuss the implications of our findings for how we think about the current communication environment, exposure to news, and ongoing attempts to limit the effects of misinformation.
-
How do Google Search results change following an impactful real-world event, such as the U.S. Supreme Court decision on June 24, 2022 to overturn Roe v. Wade? And what do they tell us about the nature of event-driven content, generated by various participants in the online information environment? In this paper, we present a dataset of more than 1.74 million Google Search results pages collected between June 24 and July 17, 2022, intended to capture what Google Search surfaced in response to queries about this event of national importance. These search pages were collected for 65 locations in 13 U.S. states, a mix of red, blue, and purple states, with respect to their voting patterns. We describe the process of building a set of circa 1,700 phrases used for searching Google, how we gathered the search results for each location, and how these results were parsed to extract information about the most frequently encountered web domains. We believe that this dataset, which comprises raw data (search results as HTML files) and processed data (extracted links organized as CSV files) can be used to answer research questions that are of interest to computational social scientists as well as communication and media studies scholars.
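The parsing step described in this abstract, going from saved HTML result pages to the most frequently encountered web domains, might look roughly like the sketch below. It uses only the Python standard library and simply collects every absolute link on a page. The dataset's actual parser and the structure of the saved pages are not described here, so the link-selection rule and all names are assumptions.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Assumption: relative links are page navigation, not results.
            if href and href.startswith(("http://", "https://")):
                self.links.append(href)


def domain_counts(html_text):
    """Count how often each domain appears among a page's absolute links."""
    parser = LinkCollector()
    parser.feed(html_text)
    domains = []
    for link in parser.links:
        host = urlparse(link).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.org and example.org alike
        domains.append(host)
    return Counter(domains)
```

Summing the returned counters across pages and calling `most_common()` on the total would give a ranking of domains per location or per query phrase.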