skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Algorithmic Misjudgement in Google Search Results: Evidence from Auditing the US Online Electoral Information Environment
Google Search is an important way that people seek information about politics [8], and Google states that it is “committed to providing timely and authoritative information on Google Search to help voters understand, navigate, and participate in democratic processes.”1 This paper studies the extent to which government-maintained web domains are represented in the online electoral information environment, as captured through 3.45 Google Search result pages collected during the 2022 US midterm elections for 786 locations across the United States. Focusing on state, county, and local government domains that provide locality-specific information, we study not only the extent to which these sources appear in organic search results, but also the extent to which these sources are correctly targeted to their respective constituents. We label misalignment between the geographic area that non-federal domains serve and the locations for which they appear in search results as algorithmic mistargeting, a subtype of algorithmic misjudgement in which the search algorithm targets locality-specific information to users in different (incorrect) locations. In the context of the 2022 US midterm elections, we find that 71% of all occurrences of state, county, and local government sources were mistargeted, with some domains appearing disproportionately often among organic results despite providing locality-specific information that may not be relevant to all voters. However, we also find that mistargeting often occurs in low ranks. We conclude by considering the potential consequences of extensive mistargeting of non-federal government sources and argue that ensuring the correct targeting of these sources to their respective constituents is a critical part of Google’s role in facilitating access to authoritative and locally-relevant electoral information.  more » « less
Award ID(s):
1751087
PAR ID:
10575980
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400704505
Page Range / eLocation ID:
433 to 443
Format(s):
Medium: X
Location:
Rio de Janeiro Brazil
Sponsoring Org:
National Science Foundation
More Like this
  1. With each successive election since at least 1994, congressional elections in the United States have transitioned toward nationalized two-party government. Fewer voters split their tickets for different parties between President and Congress. Regional blocs and incumbency voting --- a key feature of U.S. elections in the latter 20th century --- appear to have given way to strong party discipline among candidates and nationalized partisanship among voters. Observers of modern American politics are therefore tempted to write off the importance of the swing voter, defined here as voters who are indifferent between the two parties and thus likely to split their ticket or switch their party support. By assembling data from historical elections (1950 -- 2020), surveys (2008 -- 2018), and cast vote record data (2010 -- 2018), and through developing statistical methods to analyze such data, I argue that although they comprise a smaller portion of the electorate, each swing voter is disproportionately decisive in modern American politics, a phenomenon I call the swing voter paradox. Historical comparisons across Congressional, state executive, and state legislative elections confirm the decline in aggregate measures of ticket splitting suggested in past work. But the same indicator has not declined nearly as much in county legislative or county sheriff elections (Chapter 1). Ticket splitters and party switchers tend to be voters with low news interest and ideological moderate. Consistent with a spatial voting model with valence, voters also become ticket splitters when incumbents run (Chapter 2). I then provide one of the first direct measures of ticket splitting instate and local office using cast vote records. I find that ticket splitting is more prevalent in state and local elections (Chapter 3). This is surprising given the conventional wisdom that party labels serve as heuristics and down-ballot elections are low information environments. A major barrier for existing studies of the swing voter lies in the measurement from incomplete electoral data. Traditional methods struggle to extract information about subgroups from large surveys or cast vote records, because of small subgroup samples, multi-dimensional data, and systematic missingness. I therefore develop a procedure for reweighting surveys to small areas through expanding poststratification targets (Chapter 4), and a clustering algorithm for survey or ballot data with multiple offices to extract interpretable voting blocs (Chapter 5). I provide open-source software to implement both methods. These findings challenge a common characterization of modern American politics as one dominated by rigidly polarized parties and partisans. The picture that emerges instead is one where swing voters are rare but can dramatically decide the party in power, and where no single demographic group is a swing voter. Instead of entrenching elections into red states and blue states, nationalization may heighten the role of the persuadable voter. 
    more » « less
  2. This research shows that members of different ideological groups in the United States can use different search terms when looking for information about political candidates, but that difference is not enough to yield divergent search results on Google. Search engines are central in information seeking during elections, and have important implications for the distribution of information and, by extension, for democratic society. Using a method involving surveys, qualitative coding, and quantitative analysis of search terms and search results, we show that the sources of information that are returned by Google for both liberal and conservative search terms are strongly correlated. We collected search terms from people with different ideological positions about Senate candidates in the 2018 midterm election from the two main parties in the U.S., in three large and politically distinct states: California, Ohio, and Texas. We then used those search terms to scrape web results and analyze them. Our analysis shows that, in terms of the differences arising from individual search term choices, Google results exhibit a mainstreaming effect that partially neutralizes differentiation of search behaviors, by providing a set of common results, even to dissimilar searches. Based on this analysis, this article offers two main contributions: first, in the development of a method for determining group-level differences based on search input bias; and second, in demonstrating how search engines respond to diverse information seeking behavior and whether that may have implications for public discourse. 
    more » « less
  3. Search engines, by ranking a few links ahead of million others based on opaque rules, open themselves up to criticism of bias. Previous research has focused on measuring political bias of search engine algorithms to detect possible search engine manipulation effects on voters or unbalanced ideological representation in search results. Insofar that these concerns are related to the principle of fairness, this notion of fairness can be seen as explicitly oriented toward election candidates or political processes and only implicitly oriented toward the public at large. Thus, we ask the following research question: how should an auditing framework that is explicitly centered on the principle of ensuring and maximizing fairness for the public (i.e., voters) operate? To answer this question, we qualitatively explore four datasets about elections and politics in the United States: 1) a survey of eligible U.S. voters about their information needs ahead of the 2018 U.S. elections, 2) a dataset of biased political phrases used in a large-scale Google audit ahead of the 2018 U.S. election, 3) Google’s “related searches” phrases for two groups of political candidates in the 2018 U.S. election (one group is composed entirely of women), and 4) autocomplete suggestions and result pages for a set of searches on the day of a statewide election in the U.S. state of Virginia in 2019. We find that voters have much broader information needs than the search engine audit literature has accounted for in the past, and that relying on political science theories of voter modeling provides a good starting point for informing the design of voter-centered audits. 
    more » « less
  4. How do Google Search results change following an impactful real-world event, such as the U.S. Supreme Court decision on June 24, 2022 to overturn Roe v. Wade? And what do they tell us about the nature of event-driven content, generated by various participants in the online information environment? In this paper, we present a dataset of more than 1.74 million Google Search results pages collected between June 24 and July 17, 2022, intended to capture what Google Search surfaced in response to queries about this event of national importance. These search pages were collected for 65 locations in 13 U.S. states, a mix of red, blue, and purple states, with respect to their voting patterns. We describe the process of building a set of circa 1,700 phrases used for searching Google, how we gathered the search results for each location, and how these results were parsed to extract information about the most frequently encountered web domains. We believe that this dataset, which comprises raw data (search results as HTML files) and processed data (extracted links organized as CSV files) can be used to answer research questions that are of interest to computational social scientists as well as communication and media studies scholars. 
    more » « less
  5. We provide mechanisms and new metric distortion bounds for line-up elections. In such elections, a set of n voters, k candidates, and ell positions are all located in a metric space. The goal is to choose a set of candidates and assign them to different positions, so as to minimize the total cost of the voters. The cost of each voter consists of the distances from itself to the chosen candidates (measuring how much the voter likes the chosen candidates, or how similar it is to them), as well as the distances from the candidates to the positions they are assigned to (measuring the fitness of the candidates for their positions). Our mechanisms, however, do not know the exact distances, and instead produce good outcomes while only using a smaller amount of information, resulting in small distortion.We consider several different types of information: ordinal voter preferences, ordinal position preferences, and knowing the exact locations of candidates and positions, but not those of voters. In each of these cases, we provide constant distortion bounds, thus showing that only a small amount of information is enough to form outcomes close to optimum in line-up elections. 
    more » « less