Title: Assessing Candidate Preference through Web Browsing History
Predicting election outcomes is of considerable interest to candidates, political scientists, and the public at large. We propose the use of Web browsing history as a new indicator of candidate preference among the electorate, one that has potential to overcome a number of the drawbacks of election polls. However, there are a number of challenges that must be overcome to effectively use Web browsing for assessing candidate preference, including the lack of suitable ground truth data and the heterogeneity of user populations in time and space. We address these challenges, and show that the resulting methods can shed considerable light on the dynamics of voters’ candidate preferences in ways that are difficult to achieve using polls.
Award ID(s): 1703592
PAR ID: 10096153
Journal Name: 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
Page Range / eLocation ID: 158 to 167
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. We present a real-world deployment of secure multiparty computation to predict political preference from private web browsing data. To estimate aggregate preferences for the 2024 U.S. presidential election candidates, we collect and analyze secret-shared data from nearly 8000 users from August 2024 through February 2025, with over 2000 daily active users sustained throughout the bulk of the survey. The use of MPC allows us to compute over sensitive web browsing data that users would otherwise be more hesitant to provide. We collect data using a custom-built Chrome browser extension and perform our analysis using the CrypTen MPC library. To our knowledge, we provide the first implementation under MPC of a model for the learning from label proportions (LLP) problem in machine learning, which allows us to train on unlabeled web browsing data using publicly available polling and election results as the ground truth. 
  2. Scarano, Stephen; Vasudevan, Vijayalakshmi; Samory, Mattia; Yang, Kai-Cheng; Yang, JungHwan; Grabowicz, Przemyslaw A (Ed.)
    Social media platforms allow users to create polls to gather public opinion on diverse topics. However, we know little about what such polls are used for and how reliable they are, especially in significant contexts like elections. Focusing on the 2020 presidential elections in the U.S., this study shows that outcomes of election polls on Twitter deviate from election results despite their prevalence. Leveraging demographic inference and statistical analysis, we find that Twitter polls are disproportionately authored by male Republicans and exhibit a large bias towards candidate Donald Trump in comparison to mainstream polls. We investigate potential sources of biased outcomes from the point of view of inauthentic, automated, and counter-normative behavior. Using social media experiments and interviews with poll authors, we identify inconsistencies between public vote counts and those privately visible to poll authors, with the gap potentially attributable to purchased votes. We find that election polls tend to be more biased, contain more questionable votes, and attract more bots before the election day than after. We highlight and compare key factors contributing to biased poll outcomes. Finally, we identify instances of polls spreading voter fraud conspiracy theories and estimate that a couple of thousand such polls were posted in 2020. The study discusses the implications of biased election polls in the context of transparency and accountability of social media platforms. 
  3. Polls posted on social media can provide information about public opinion on a variety of issues from business decisions to support for presidential election candidates. However, it is largely unknown whether the information provided by social polls is useful or not. To enhance our understanding of social polls, we examine nearly two thousand Twitter polls gauging support for U.S. presidential candidates during the 2016 and 2020 election campaigns. First, we describe the prevalence of social polls. Second, we characterize social polls in terms of the engagement they elicit and the response options they present. Third, leveraging machine learning models, we infer and describe several characteristics, including demographics and political leanings, of the users who author and interact with social polls. Finally, we study the relationship between social poll results, their attributes, and the characteristics of users interacting with them. Our findings suggest how and to what extent polling on Twitter is biased in terms of content, authorship, and audience. The 2016 and 2020 polls were predominantly crafted by older males and manifested a pronounced bias favoring candidate Donald Trump, whereas traditional surveys favored Democratic candidates. We further identify and explore the potential reasons for such biases and discuss their repercussions. 
  4. Presidential elections can be forecast using information from political and economic conditions, polls, and a statistical model of changes in public opinion over time. However, these “knowns” about how to make a good presidential election forecast come with many unknowns due to the challenges of evaluating forecast calibration and communication. We highlight how incentives may shape forecasts, and particularly forecast uncertainty, in light of calibration challenges. We illustrate these challenges in creating, communicating, and evaluating election predictions, using the Economist and FiveThirtyEight forecasts of the 2020 election as examples, and offer recommendations for forecasters and scholars.
  5. A boardroom election is an election with a small number of voters carried out with public communications. We present BVOT, a self-tallying boardroom voting protocol with ballot secrecy, fairness (no tally information is available before the polls close), and dispute-freeness (voters can observe that all voters correctly followed the protocol). BVOT works by using a multiparty threshold homomorphic encryption system in which each candidate is associated with a set of masked primes. Each voter engages in an oblivious transfer with an untrusted distributor: the voter selects the index of a prime associated with a candidate and receives the selected prime in masked form. The voter then casts their vote by encrypting their masked prime and broadcasting it to everyone. The distributor does not learn the voter's choice, and no one learns the mapping between primes and candidates until the audit phase. By hiding the mapping between primes and candidates, BVOT provides voters with insufficient information to carry out effective cheating. The threshold feature prevents anyone from computing any partial tally until everyone has voted. Multiplying all votes, their decryption shares, and the unmasking factor yields a product of the primes each raised to the number of votes received. In contrast to some existing boardroom voting protocols, BVOT does not rely on any zero-knowledge proof; instead, it uses oblivious transfer to assure ballot secrecy and correct vote casting. Also, BVOT can handle multiple candidates in one election. BVOT prevents cheating by hiding crucial information: an attempt to increase the tally of one candidate might increase the tally of another candidate. After all votes are cast, any party can tally the votes.
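The final tallying step described in the BVOT abstract, where the combined value is a product of primes each raised to the number of votes received, can be illustrated with a minimal sketch. This is not the BVOT protocol: it assumes one unmasked prime per candidate (the abstract describes a set of masked primes per candidate) and omits the encryption, oblivious transfer, and unmasking entirely. The function name `tally_from_product` and the candidate-to-prime mapping are hypothetical and chosen only for illustration.

```python
def tally_from_product(product, candidate_primes):
    """Recover per-candidate vote counts from a product of primes.

    Assumption (not from the source): each candidate maps to exactly one
    distinct prime, and `product` is the already-decrypted, unmasked value.
    """
    if product < 1:
        raise ValueError("product must be a positive integer")
    counts = {}
    remaining = product
    for candidate, prime in candidate_primes.items():
        count = 0
        # The exponent of each prime equals the number of votes received.
        while remaining % prime == 0:
            remaining //= prime
            count += 1
        counts[candidate] = count
    if remaining != 1:
        raise ValueError("product has factors outside the candidate primes")
    return counts


# Hypothetical mapping and ballots, for illustration only.
candidate_primes = {"A": 2, "B": 3, "C": 5}
votes = ["A", "B", "A", "C", "A"]

product = 1
for vote in votes:
    product *= candidate_primes[vote]

result = tally_from_product(product, candidate_primes)
print(result)
```

Because prime factorization is unique, the exponents can be read off by repeated division once the mapping between primes and candidates is known, which is consistent with the abstract's statement that any party can tally the votes after all votes are cast.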