
Title: The Case for Voter-Centered Audits of Search Engines During Political Elections
Search engines, by ranking a few links ahead of millions of others based on opaque rules, open themselves up to criticism of bias. Previous research has focused on measuring political bias of search engine algorithms to detect possible search engine manipulation effects on voters or unbalanced ideological representation in search results. Insofar as these concerns are related to the principle of fairness, this notion of fairness can be seen as explicitly oriented toward election candidates or political processes and only implicitly oriented toward the public at large. Thus, we ask the following research question: how should an auditing framework that is explicitly centered on the principle of ensuring and maximizing fairness for the public (i.e., voters) operate? To answer this question, we qualitatively explore four datasets about elections and politics in the United States: 1) a survey of eligible U.S. voters about their information needs ahead of the 2018 U.S. elections, 2) a dataset of biased political phrases used in a large-scale Google audit ahead of the 2018 U.S. election, 3) Google’s “related searches” phrases for two groups of political candidates in the 2018 U.S. election (one group is composed entirely of women), and 4) autocomplete suggestions and result pages for a set of searches on the day of a statewide election in the U.S. state of Virginia in 2019. We find that voters have much broader information needs than the search engine audit literature has accounted for in the past, and that relying on political science theories of voter modeling provides a good starting point for informing the design of voter-centered audits.
Journal Name: The 3rd ACM Conference on Fairness, Accountability, and Transparency (FAT* 2020)
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. With each successive election since at least 1994, congressional elections in the United States have transitioned toward nationalized two-party government. Fewer voters split their tickets for different parties between President and Congress. Regional blocs and incumbency voting (key features of U.S. elections in the latter 20th century) appear to have given way to strong party discipline among candidates and nationalized partisanship among voters. Observers of modern American politics are therefore tempted to write off the importance of the swing voter, defined here as a voter who is indifferent between the two parties and thus likely to split their ticket or switch their party support. By assembling data from historical elections (1950-2020), surveys (2008-2018), and cast vote record data (2010-2018), and through developing statistical methods to analyze such data, I argue that although they comprise a smaller portion of the electorate, each swing voter is disproportionately decisive in modern American politics, a phenomenon I call the swing voter paradox. Historical comparisons across Congressional, state executive, and state legislative elections confirm the decline in aggregate measures of ticket splitting suggested in past work. But the same indicator has not declined nearly as much in county legislative or county sheriff elections (Chapter 1). Ticket splitters and party switchers tend to be ideologically moderate voters with low news interest. Consistent with a spatial voting model with valence, voters also become ticket splitters when incumbents run (Chapter 2). I then provide one of the first direct measures of ticket splitting in state and local offices using cast vote records. I find that ticket splitting is more prevalent in state and local elections (Chapter 3). This is surprising given the conventional wisdom that party labels serve as heuristics and down-ballot elections are low-information environments.
A major barrier for existing studies of the swing voter lies in measurement from incomplete electoral data. Traditional methods struggle to extract information about subgroups from large surveys or cast vote records because of small subgroup samples, multi-dimensional data, and systematic missingness. I therefore develop a procedure for reweighting surveys to small areas through expanding poststratification targets (Chapter 4), and a clustering algorithm for survey or ballot data with multiple offices to extract interpretable voting blocs (Chapter 5). I provide open-source software to implement both methods. These findings challenge a common characterization of modern American politics as one dominated by rigidly polarized parties and partisans. The picture that emerges instead is one where swing voters are rare but can dramatically decide the party in power, and where no single demographic group is a swing voter. Instead of entrenching elections into red states and blue states, nationalization may heighten the role of the persuadable voter.
  2. This research shows that members of different ideological groups in the United States can use different search terms when looking for information about political candidates, but that difference is not enough to yield divergent search results on Google. Search engines are central in information seeking during elections, and have important implications for the distribution of information and, by extension, for democratic society. Using a method involving surveys, qualitative coding, and quantitative analysis of search terms and search results, we show that the sources of information that are returned by Google for both liberal and conservative search terms are strongly correlated. We collected search terms from people with different ideological positions about Senate candidates in the 2018 midterm election from the two main parties in the U.S., in three large and politically distinct states: California, Ohio, and Texas. We then used those search terms to scrape web results and analyze them. Our analysis shows that, in terms of the differences arising from individual search term choices, Google results exhibit a mainstreaming effect that partially neutralizes differentiation of search behaviors, by providing a set of common results, even to dissimilar searches. Based on this analysis, this article offers two main contributions: first, in the development of a method for determining group-level differences based on search input bias; and second, in demonstrating how search engines respond to diverse information seeking behavior and whether that may have implications for public discourse. 
  3. The prevalence and spread of online misinformation during the 2020 US presidential election served to perpetuate a false belief in widespread election fraud. Though much research has focused on how social media platforms connected people to election-related rumors and conspiracy theories, less is known about the search engine pathways that linked users to news content with the potential to undermine trust in elections. In this paper, we present novel data related to the content of political headlines during the 2020 US election period. We scraped over 800,000 headlines from Google's search engine results pages (SERP) in response to 20 election-related keywords—10 general (e.g., "Ballots") and 10 conspiratorial (e.g., "Voter fraud")—when searched from 20 cities across 16 states. We present results from qualitative coding of 5,600 headlines focused on the prevalence of delegitimizing information. Our results reveal that videos (as compared to stories, search results, and advertisements) are the most problematic in terms of exposing users to delegitimizing headlines. We also illustrate how headline content varies when searching from a swing state, adopting a conspiratorial search keyword, or reading from media domains with higher political bias. We conclude with policy recommendations on data transparency that allow researchers to continue to monitor search engines during elections. 
  4. A boardroom election is an election with a small number of voters carried out with public communications. We present BVOT, a self-tallying boardroom voting protocol with ballot secrecy, fairness (no tally information is available before the polls close), and dispute-freeness (voters can observe that all voters correctly followed the protocol). BVOT works by using a multiparty threshold homomorphic encryption system in which each candidate is associated with a set of masked primes. Each voter engages in an oblivious transfer with an untrusted distributor: the voter selects the index of a prime associated with a candidate and receives the selected prime in masked form. The voter then casts their vote by encrypting their masked prime and broadcasting it to everyone. The distributor does not learn the voter's choice, and no one learns the mapping between primes and candidates until the audit phase. By hiding the mapping between primes and candidates, BVOT provides voters with insufficient information to carry out effective cheating. The threshold feature prevents anyone from computing any partial tally---until everyone has voted. Multiplying all votes, their decryption shares, and the unmasking factor yields a product of the primes each raised to the number of votes received. In contrast to some existing boardroom voting protocols, BVOT does not rely on any zero-knowledge proof; instead, it uses oblivious transfer to assure ballot secrecy and correct vote casting. Also, BVOT can handle multiple candidates in one election. BVOT prevents cheating by hiding crucial information: an attempt to increase the tally of one candidate might increase the tally of another candidate. After all votes are cast, any party can tally the votes.
  5. When one searches for political candidates on Google, a panel composed of recent news stories, known as Top stories, is commonly shown at the top of the search results page. These stories are selected by an algorithm that chooses from hundreds of thousands of articles published by thousands of news publishers. In our previous work, we identified 56 news sources that contributed 2/3 of all Top stories for 30 political candidates running in the primaries of the 2020 US Presidential Election. In this paper, we survey US voters to elicit their familiarity with and trust in these 56 news outlets. We find that some of the most frequent outlets are not familiar to all voters (e.g. The Hill or Politico), or particularly trusted by voters of any political stripe (e.g. Washington Examiner or The Daily Beast). Why, then, are such sources shown so frequently in Top stories? We theorize that Google is sampling news articles from sources with different political leanings to offer balanced coverage. This is reminiscent of the so-called “fairness doctrine” (1949-1987) policy in the United States that required broadcasters (radio or TV stations) to air contrasting views about controversial matters. Because there are fewer right-leaning publications than center or left-leaning ones, in order to maintain this “fair” balance, hyper-partisan far-right news sources of low trust receive more visibility than some news sources that are more familiar to and trusted by the public.
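The BVOT abstract in item 4 states that multiplying the votes yields a product of primes, each raised to the number of votes its candidate received. The sketch below illustrates only that counting idea in the clear. It is not the BVOT protocol: it omits the masking, the oblivious transfer, and the threshold homomorphic encryption, and it assumes one prime per candidate where the abstract describes a set of masked primes. The candidate names, the prime assignments, and the functions `cast` and `tally` are hypothetical.

```python
from math import prod

# Hypothetical mapping for illustration only: one small prime per candidate.
# (The abstract associates each candidate with a set of masked primes.)
CANDIDATE_PRIMES = {"alice": 2, "bob": 3, "carol": 5}


def cast(candidate: str) -> int:
    """Return the prime that stands in for a vote for `candidate`."""
    return CANDIDATE_PRIMES[candidate]


def tally(product: int) -> dict:
    """Recover per-candidate counts from the product of all cast primes."""
    counts = {}
    for name, p in CANDIDATE_PRIMES.items():
        n = 0
        # The exponent of p in the product is the number of votes for `name`.
        while product % p == 0:
            product //= p
            n += 1
        counts[name] = n
    if product != 1:
        # A leftover factor means some value was not a candidate prime.
        raise ValueError("product contains a factor that is not a candidate prime")
    return counts


votes = [cast(c) for c in ["alice", "bob", "alice", "carol", "alice"]]
combined = prod(votes)  # 2 * 3 * 2 * 5 * 2 = 120
result = tally(combined)
print(result)  # {'alice': 3, 'bob': 1, 'carol': 1}
```

Because prime factorization is unique, the exponents can be read off by repeated division, which is why any party holding the final product can compute the tally.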