NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Privacy-Preserving Machine Learning on Web Browsing for Public Opinion

Buxbaum, Sam; Tassis, Lucas M; Boschelli, Lucas; Comarela, Giovanni; Varia, Mayank; Crovella, Mark; Christenson, Dino P (December 2025, Springer)

We present a real-world deployment of secure multiparty computation to predict political preference from private web browsing data. To estimate aggregate preferences for the 2024 U.S. presidential election candidates, we collect and analyze secret-shared data from nearly 8000 users from August 2024 through February 2025, with over 2000 daily active users sustained throughout the bulk of the survey. The use of MPC allows us to compute over sensitive web browsing data that users would otherwise be more hesitant to provide. We collect data using a custom-built Chrome browser extension and perform our analysis using the CrypTen MPC library. To our knowledge, we provide the first implementation under MPC of a model for the learning from label proportions (LLP) problem in machine learning, which allows us to train on unlabeled web browsing data using publicly available polling and election results as the ground truth.
more » « less
Free, publicly-accessible full text available December 4, 2026
SysteMPC’25: 1st Workshop on Systems for Secure Multi-Party Computation

Liagouris, John; Kalavri, Vasiliki; Varia, Mayank (September 2025, https://hdl.handle.net/2144/51164)

The SysteMPC workshop brings together cryptographers and systems researchers to discuss advances in overcom- ing practical challenges of using MPC in the wild. Topics of interest include cryptography and systems co-design, pro- gramming abstractions, modularity of cryptographic soft- ware, hardware acceleration, experimentation, and integra- tion with the existing ecosystem. In this short report, we summarize the inaugural edition of the workshop, which was held on July 10, 2025 at Boston University.
more » « less
Free, publicly-accessible full text available September 8, 2026
Secure Account Recovery for a Privacy-Preserving Web Service

Little, Ryan; Qin, Lucy; Varia, Mayank (August 2024, USENIX Association)

If a web service is so secure that it does not even know---and does not want to know---the identity and contact info of its users, can it still offer account recovery if a user forgets their password? This paper is the culmination of the authors' work to design a cryptographic protocol for account recovery for use by a prominent secure matching system: a web-based service that allows survivors of sexual misconduct to become aware of other survivors harmed by the same perpetrator. In such a system, the list of account-holders must be safeguarded, even against the service provider itself. In this work, we design an account recovery system that, on the surface, appears to follow the typical workflow: the user types in their email address, receives an email containing a one-time link, and answers some security questions. Behind the scenes, the defining feature of our recovery system is that the service provider can perform email-based account validation without knowing, or being able to learn, a list of users' email addresses. Our construction uses standardized cryptography for most components, and it has been deployed in production at the secure matching system. As a building block toward our main construction, we design a new cryptographic primitive that may be of independent interest: an oblivious pseudorandom function that can either have a fully-private input or a partially-public input, and that reaches the same output either way. This primitive allows us to perform online rate limiting for account recovery attempts, without imposing a bound on the creation of new accounts. We provide an open-source implementation of this primitive and provide evaluation results showing that the end-to-end interaction time takes 8.4-60.4 ms in fully-private input mode and 3.1-41.2 ms in partially-public input mode.
more » « less
Full Text Available
QueryShield: Cryptographically Secure Analytics in the Cloud

https://doi.org/10.1145/3626246.3654749

Seow, Ethan; Tong, Yan; Baum, Eli; Buxbaum, Sam; Faisal, Muhammad; Liagouris, John; Kalavri, Vasiliki; Varia, Mayank (June 2024, ACM)

Full Text Available
TVA: A multi-party computation system for secure and expressive time series analytics

Faisal, Muhammad; Zhang, Jerry; Liagouris, John; Vasiliki, Kalavri; Varia, Mayank (August 2023, USENIX Security 2023)

We present TVA, a multi-party computation (MPC) system for secure analytics on secret-shared time series data. TVA achieves strong security guarantees in the semi-honest and malicious settings, and high expressivity by enabling complex analytics on inputs with unordered and irregular timestamps. TVA is the first system to support arbitrary composition of oblivious window operators, keyed aggregations, and multiple filter predicates, while keeping all data attributes private, including record timestamps and user-defined values in query predicates. At the core of the TVA system lie novel protocols for secure window assignment: (i) a tumbling window protocol that groups records into fixed-length time buckets and (ii) two session window protocols that identify periods of activity followed by periods of inactivity. We also contribute a new protocol for secure division with a public divisor, which may be of independent interest. We evaluate TVA on real LAN and WAN environments and show that it can efficiently compute complex window-based analytics on inputs of 2^22 records with modest use of resources. When compared to the state-of-the-art, TVA achieves up to 5.8× lower latency in queries with multiple filters and two orders of magnitude better performance in window aggregation.
more » « less
Full Text Available
Secrecy: Secure collaborative analytics in untrusted clouds

Liagouris, John; Kalavri, Vasiliki; Faisal, Muhammad; Varia, Mayank (April 2023, 20th USENIX Symposium on Networked Systems Design and Implementation)

We present Secrecy, a system for privacy-preserving collaborative analytics as a service. Secrecy allows multiple data holders to contribute their data towards a joint analysis in the cloud, while keeping the data siloed even from the cloud providers. At the same time, it enables cloud providers to offer their services to clients who would have otherwise refused to perform a computation altogether or insisted that it be done on private infrastructure. Secrecy ensures no information leakage and provides provable security guarantees by employing cryptographically secure Multi-Party Computation (MPC). In Secrecy we take a novel approach to optimizing MPC execution by co-designing multiple layers of the system stack and exposing the MPC costs to the query engine. To achieve practical performance, Secrecy applies physical optimizations that amortize the inherent MPC overheads along with logical optimizations that dramatically reduce the computation, communication, and space requirements during query execution. Our multi-cloud experiments demonstrate that Secrecy improves query performance by over 1000x compared to existing approaches and computes complex analytics on millions of data records with modest use of resources.
more » « less
Full Text Available
Can the Government Compel Decryption? Don’t Trust — Verify

Cohen, Aloni; Scheffler, Sarah; Varia, Mayank (November 2022, 2nd ACM Symposium on Computer Science and Law)

If a court knows that a respondent knows the password to a device, can the court compel the respondent to enter that password into the device? In this work, we propose a new approach to the foregone conclusion doctrine from Fisher v. U.S. that governs the answer to this question. The Holy Grail of this line of work would be a framework for reasoning about whether the testimony implicit in any action is already known to the government. In this paper we attempt something narrower. We introduce a framework for specifying actions for which all implicit testimony is, constructively, a foregone conclusion. Our approach is centered around placing the burden of proof on the government to demonstrate that it is not “rely[ing] on the truthtelling” of the respondent. Building on original legal analysis and using precise computer science formalisms, we propose demonstrability as a new central concept for describing compelled acts. We additionally provide a language for whether a compelled action meaningfully entails the respondent to perform in a manner that is “as good as” the government’s desired goal. Then, we apply our definitions to analyze the compellability of several cryptographic primitives including decryption, multifactor authentication, commitment schemes, and hash functions. In particular, our framework reaches a novel conclusion about compelled decryption in the setting that the encryption scheme is deniable: the government can compel but the respondent is free to use any password of her choice.
more » « less
Full Text Available
Formalizing Human Ingenuity: A Quantitative Framework for Copyright Law’s Substantial Similarity

Scheffler, Sarah; Tromer, Eran; Varia, Mayank (November 2022, 2nd ACM Symposium on Computer Science and Law)

A central notion in U.S. copyright law is judging the substantial similarity between an original and an (allegedly) derived work. Capturing this notion has proven elusive, and the many approaches offered by case law and legal scholarship are often ill-defined, contradictory, or internally-inconsistent. This work suggests that key parts of the substantial-similarity puzzle are amendable to modeling inspired by theoretical computer science. Our proposed framework quantitatively evaluates how much "novelty" is needed to produce the derived work with access to the original work, versus reproducing it without access to the copyrighted elements of the original work. "Novelty" is captured by a computational notion of description length, in the spirit of Kolmogorov-Levin complexity, which is robust to mechanical transformations and availability of contextual information. This results in an actionable framework that could be used by courts as an aid for deciding substantial similarity. We evaluate it on several pivotal cases in copyright law and observe that the results are consistent with the rulings, and are philosophically aligned with the abstraction-filtration-comparison test of Altai.
more » « less
Full Text Available
Multi-Regulation Computing: Examining the Legal and Policy Questions That Arise From Secure Multiparty Computation

Walsh, Julissa Milligan; Varia, Mayank; Cohen, Aloni; Bestavros, Azer (November 2022, 2nd ACM Symposium on Computer Science and Law)

This work examines privacy laws and regulations that limit disclosure of personal data, and explores whether and how these restrictions apply when participants use cryptographically secure multi-party computation (MPC). By protecting data during use, MPC offers the promise of conducting data science in a way that (in some use cases) meets or even exceeds most people’s conceptions of data privacy. With MPC, it is possible to correlate individual records across multiple datasets without revealing the underlying records, to conduct aggregate analysis across datasets which parties are otherwise unwilling to share for competitive reasons, and to analyze aggregate statistics across datasets which no individual party may lawfully hold. However, most adoptions of MPC to date involve data that is not subject to privacy protection under the law. We posit that a major impediment to the adoption of MPC—on the data that society has deemed most worthy of protection—is the difficulty of mapping this new technology onto the design principles of data privacy laws. While a computer scientist might reasonably believe that transforming any data analysis into its privacy-protective variant using MPC is a clear win, we show in this work that the technological guarantees of MPC do not directly imply compliance with privacy laws. Specifically, a lawyer will likely want to ask several important questions about the pre-conditions that are necessary for MPC to succeed, the risk that data might inadvertently or maliciously be disclosed to someone other than the output party, and what recourse to take if this bad event occurs. We have two goals for this work: explaining why the privacy law questions are nuanced and that the lawyer is correct to proceed cautiously, and providing a framework that lawyers can use to reason systematically about whether and how MPC implicates data privacy laws in the context of a specific use case. Our framework revolves around three questions: a definitional question on whether the encodings still constitute ‘personal data,’ a process question about whether the act of executing MPC constitutes a data disclosure event, and a liability question about what happens if something goes wrong. We conclude by providing advice to regulators and suggestions to early adopters to spur uptake of MPC. It is our hope that this work provides the first step toward a methodology that organizations can use when contemplating the use of MPC.
more » « less
Full Text Available
Hecate: Abuse Reporting in Secure Messengers with Sealed Sender

Issa, Rawane; Alhaddad, Nicolas; Varia, Mayank (August 2022, 31st USENIX Security Symposium)

End-to-end encryption provides strong privacy protections to billions of people, but it also complicates efforts to moderate content that can seriously harm people. To address this concern, Tyagi et al. [CRYPTO 2019] introduced the concept of asymmetric message franking (AMF) so that people can report abusive content to a moderator, while otherwise retaining end-to-end privacy by default and compatibility with anonymous communication systems like Signal’s sealed sender. In this work, we provide a new construction for asymmetric message franking called Hecate that is faster, more secure, and introduces additional functionality compared to Tyagi et al. First, our construction uses fewer invocations of standardized crypto primitives and operates in the plain model. Second, on top of AMF’s accountability and deniability requirements, we also add forward and backward secrecy. Third, we combine AMF with source tracing, another approach to content moderation that has previously been considered only in the setting of non-anonymous networks. Source tracing allows for messages to be forwarded, and a report only identifies the original source who created a message. To provide anonymity for senders and forwarders, we introduce a model of AMF with preprocessing whereby every client authenticates with the moderator out-of-band to receive a token that they later consume when sending a message anonymously.
more » « less
Full Text Available

« Prev Next »

Search for: All records