skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on December 4, 2026

Title: Privacy-Preserving Machine Learning on Web Browsing for Public Opinion
We present a real-world deployment of secure multiparty computation to predict political preference from private web browsing data. To estimate aggregate preferences for the 2024 U.S. presidential election candidates, we collect and analyze secret-shared data from nearly 8000 users from August 2024 through February 2025, with over 2000 daily active users sustained throughout the bulk of the survey. The use of MPC allows us to compute over sensitive web browsing data that users would otherwise be more hesitant to provide. We collect data using a custom-built Chrome browser extension and perform our analysis using the CrypTen MPC library. To our knowledge, we provide the first implementation under MPC of a model for the learning from label proportions (LLP) problem in machine learning, which allows us to train on unlabeled web browsing data using publicly available polling and election results as the ground truth.  more » « less
Award ID(s):
2209194
PAR ID:
10633151
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
Springer
Date Published:
Format(s):
Medium: X
Location:
International Symposium on Cyber Security, Cryptology and Machine Learning (CSCML) 2025
Sponsoring Org:
National Science Foundation
More Like this
  1. Predicting election outcomes is of considerable interest to candidates, political scientists, and the public at large. We propose the use of Web browsing history as a new indicator of candidate preference among the electorate, one that has potential to overcome a number of the drawbacks of election polls. However, there are a number of challenges that must be overcome to effectively use Web browsing for assessing candidate preference—including the lack of suitable ground truth data and the heterogeneity of user populations in time and space. We address these challenges, and show that the resulting methods can shed considerable light on the dynamics of voters’ candidate preferences in ways that are difficult to achieve using polls. 
    more » « less
  2. Secure multi-party computation (MPC) allows multiple parties to jointly compute the output of a function while preserving the privacy of any individual party's inputs to that function. As MPC protocols transition from research prototypes to real-world applications, the usability of MPC-enabled applications is increasingly critical to their successful deployment and wide adoption. Our Web-MPC platform, designed with a focus on usability, has been deployed for privacy-preserving data aggregation initiatives with the City of Boston and the Greater Boston Chamber of Commerce. After building and deploying an initial version of this platform, we conducted a heuristic evaluation to identify additional usability improvements and implemented corresponding application enhancements. However, it is difficult to gauge the effectiveness of these changes within the context of real-world deployments using traditional web analytics tools without compromising the security guarantees of the platform. This work consists of two contributions that address this challenge: (1) the Web-MPC platform has been extended with the capability to collect web analytics using existing MPC protocols, and (2) this capability has been leveraged to conduct a usability study comparing the two version of Web-MPC (before and after the heuristic evaluation and associated improvements). While many efforts have focused on ways to enhance the usability of privacy-preserving technologies, this study can serve as a model for using a privacy-preserving data-driven approach in evaluating or enhancing the usability of privacy-preserving websites and applications deployed in real-world scenarios. The data collected in this study yields insights about the interplay between usability and security that can help inform future implementations of applications that employ MPC. 
    more » « less
  3. An essential component of initiatives that aim to address pervasive inequalities of any kind is the ability to collect empirical evidence of both the status quo baseline and of any improvement that can be attributed to prescribed and deployed interventions. Unfortunately, two substantial barriers can arise preventing the collection and analysis of such empirical evidence: (1) the sensitive nature of the data itself and (2) a lack of technical sophistication and infrastructure available to both an initiative’s beneficiaries and to those spearheading it. In the last few years, it has been shown that a cryptographic primitive called secure multi-party computation (MPC) can provide a natural technological resolution to this conundrum. MPC allows an otherwise disinterested third party to contribute its technical expertise and resources, to avoid incurring any additional liabilities itself, and (counterintuitively) to reduce the level of data exposure that existing parties must accept to achieve their data analysis goals. However, achieving these benefits requires the deliberate design of MPC tools and frameworks whose level of accessibility to non-technical users with limited infrastructure and expertise is state-of-the-art. We describe our own experiences designing, implementing, and deploying such usable web applications for secure data analysis within the context of two real-world initiatives that focus on promoting economic equality. 
    more » « less
  4. Navigating webpages with screen readers is a challenge even with recent improvements in screen reader technologies and the increased adoption of web standards for accessibility, namely ARIA. ARIA landmarks, an important aspect of ARIA, lets screen reader users access different sections of the webpage quickly, by enabling them to skip over blocks of irrelevant or redundant content. However, these landmarks are sporadically and inconsistently used by web developers, and in many cases, even absent in numerous web pages. Therefore,we propose SaIL, a scalable approach that automatically detects the important sections of a web page, and then injects ARIA landmarks into the corresponding HTML markup to facilitate quick access to these sections. The central concept underlying SaIL is visual saliency, which is determined using a state-of-the-art deep learning model that was trained on gaze-tracking data collected from sighted users in the context of web browsing. We present the findings of a pilot study that demonstrated the potential of SaIL in reducing both the time and effort spent in navigating webpages with screen readers. 
    more » « less
  5. The surge in political information, discourse, and interaction has been one of the most important developments in social media over the past several years. There is rich structure in the interaction among different viewpoints on the ideological spectrum. However, we still have only a limited analytical vocabulary for expressing the ways in which these viewpoints interact. In this paper, we develop network-based methods that operate on the ways in which users share content; we construct invocation graphs on Web domains showing the extent to which pages from one domain are invoked by users to reply to posts containing pages from other domains. When we locate the domains on a political spectrum induced from the data, we obtain an embedded graph showing how these interaction links span different distances on the spectrum. The structure of this embedded network, and its evolution over time, helps us derive macro-level insights about how political interaction unfolded through 2016, leading up to the US Presidential election. In particular, we find that the domains invoked in replies spanned increasing distances on the spectrum over the months approaching the election, and that there was clear asymmetry between the left-to-right and right-to-left patterns of linkage. 
    more » « less