Abstract While the practicality of secure multi-party computation (MPC) has been extensively analyzed and improved over the past decade, we are hitting the limits of efficiency with the traditional approaches of representing the computed functionalities as generic arithmetic or Boolean circuits. This work follows the design principle of identifying and constructing fast and provably-secure MPC protocols to evaluate useful high-level algebraic abstractions; thus, improving the efficiency of all applications relying on them. We present Polymath, a constant-round secure computation protocol suite for the secure evaluation of (multi-variate) polynomials of scalars and matrices, functionalities essential to numerous data-processing applications. Using precise natural precomputation and high-degree of parallelism prevalent in the modern computing environments, Polymath can make latency of secure polynomial evaluations of scalars and matrices independent of polynomial degree and matrix dimensions. We implement our protocols over the HoneyBadgerMPC library and apply it to two prominent secure computation tasks: privacy-preserving evaluation of decision trees and privacy-preserving evaluation of Markov processes. For the decision tree evaluation problem, we demonstrate the feasibility of evaluating high-depth decision tree models in a general n -party setting. For the Markov process application, we demonstrate that Poly-math can compute large powers of transition matrices with better online time and less communication.
more »
« less
From Usability to Secure Computing and Back Again
Secure multi-party computation (MPC) allows multiple parties to jointly compute the output of a function while preserving the privacy of any individual party's inputs to that function. As MPC protocols transition from research prototypes to real-world applications, the usability of MPC-enabled applications is increasingly critical to their successful deployment and wide adoption. Our Web-MPC platform, designed with a focus on usability, has been deployed for privacy-preserving data aggregation initiatives with the City of Boston and the Greater Boston Chamber of Commerce. After building and deploying an initial version of this platform, we conducted a heuristic evaluation to identify additional usability improvements and implemented corresponding application enhancements. However, it is difficult to gauge the effectiveness of these changes within the context of real-world deployments using traditional web analytics tools without compromising the security guarantees of the platform. This work consists of two contributions that address this challenge: (1) the Web-MPC platform has been extended with the capability to collect web analytics using existing MPC protocols, and (2) this capability has been leveraged to conduct a usability study comparing the two version of Web-MPC (before and after the heuristic evaluation and associated improvements). While many efforts have focused on ways to enhance the usability of privacy-preserving technologies, this study can serve as a model for using a privacy-preserving data-driven approach in evaluating or enhancing the usability of privacy-preserving websites and applications deployed in real-world scenarios. The data collected in this study yields insights about the interplay between usability and security that can help inform future implementations of applications that employ MPC.
more »
« less
- PAR ID:
- 10099689
- Date Published:
- Journal Name:
- Fifteenth Symposium on Usable Privacy and Security
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Background Logistic regression (LR) is a widely used classification method for modeling binary outcomes in many medical data classification tasks. Researchers that collect and combine datasets from various data custodians and jurisdictions can greatly benefit from the increased statistical power to support their analysis goals. However, combining data from different sources creates serious privacy concerns that need to be addressed. Methods In this paper, we propose two privacy-preserving protocols for performing logistic regression with the Newton–Raphson method in the estimation of parameters. Our proposals are based on secure Multi-Party Computation (MPC) and tailored to the honest majority and dishonest majority security settings. Results The proposed protocols are evaluated against both synthetic and real-world datasets in terms of efficiency and accuracy, and a comparison is made with the ordinary logistic regression. The experimental results demonstrate that the proposed protocols are highly efficient and accurate. Conclusions Our work introduces two iterative algorithms to enable the distributed training of a logistic regression model in a privacy-preserving manner. The implementation results show that our algorithms can handle large datasets from multiple sources.more » « less
-
null (Ed.)Research and experimentation using big data sets, specifically large sets of electronic health records (EHR) and social media data, is demonstrating the potential to understand the spread of diseases and a variety of other issues. Applications of advanced algorithms, machine learning, and artificial intelligence indicate a potential for rapidly advancing improvements in public health. For example, several reports indicate that social media data can be used to predict disease outbreak and spread (Brown, 2015). Since real-world EHR data has complicated security and privacy issues preventing it from being widely used by researchers, there is a real need to synthetically generate EHR data that is realistic and representative. Current EHR generators, such as Syntheaä (Walonoski et al., 2018) only simulate and generate pure medical-related data. However, adding patients’ social media data with their simulated EHR data would make combined data more comprehensive and realistic for healthcare research. This paper presents a patients’ social media data generator that extends an EHR data generator. By adding coherent social media data to EHR data, a variety of issues can be examined for emerging interests, such as where a contagious patient may have been and others with whom they may have been in contact. Social media data, specifically Twitter data, is generated with phrases indicating the onset of symptoms corresponding to the synthetically generated EHR reports of simulated patients. This enables creation of an open data set that is scalable up to a big-data size, and is not subject to the security, privacy concerns, and restrictions of real healthcare data sets. This capability is important to the modeling and simulation community, such as scientists and epidemiologists who are developing algorithms to analyze the spread of diseases. It enables testing a variety of analytics without revealing real-world private patient information.more » « less
-
In this paper, we propose a new secure machine learning inference platform assisted by a small dedicated security processor, which will be easier to protect and deploy compared to today's TEEs integrated into high-performance processors. Our platform provides three main advantages over the state-of-the-art: (i) We achieve significant performance improvements compared to state-of-the-art distributed Privacy-Preserving Machine Learning (PPML) protocols, with only a small security processor that is comparable to a discrete security chip such as the Trusted Platform Module (TPM) or on-chip security subsystems in SoCs similar to the Apple enclave processor. In the semi-honest setting with WAN/GPU, our scheme is 4X-63X faster than Falcon (PoPETs'21) and AriaNN (PoPETs'22) and 3.8X-12X more communication efficient. We achieve even higher performance improvements in the malicious setting. (ii) Our platform guarantees security with abort against malicious adversaries under honest majority assumption. (iii) Our technique is not limited by the size of secure memory in a TEE and can support high-capacity modern neural networks like ResNet18 and Transformer. While previous work investigated the use of high-performance TEEs in PPML, this work represents the first to show that even tiny secure hardware with very limited performance can be leveraged to significantly speed-up distributed PPML protocols if the protocol can be carefully designed for lightweight trusted hardware.more » « less
-
Software applications that employ secure multi-party computation (MPC) can empower individuals and organizations to benefit from privacy-preserving data analyses when data sharing is encumbered by confidentiality concerns, legal constraints, or corporate policies. MPC is already being incorporated into software solutions in some domains; however, individual use cases do not fully convey the variety, extent, and complexity of the opportunities of MPC. This position paper articulates a role-based perspective that can provide some insight into how future research directions, infrastructure development and evaluation approaches, and deployment practices for MPC may evolve. Drawing on our own lessons from existing real-world deployments and the fundamental characteristics of MPC that make it a compelling technology, we propose a role-based conceptual framework for describing MPC deployment scenarios. Our framework acknowledges and leverages a novel assortment of roles that emerge from the fundamental ways in which MPC protocols support federation of functionalities and responsibilities. Defining these roles using the new opportunities for federation that MPC enables in turn can help identify and organize the capabilities, concerns, incentives, and trade-offs that affect the entities (software engineers, government regulators, corporate executives, end-users, and others) that participate in an MPC deployment scenario. This framework can not only guide the development of an ecosystem of modular and composable MPC tools, but can make explicit some of the opportunities that researchers and software engineers (and any organizations they form) have to differentiate and specialize the artifacts and services they choose to design, develop, and deploy. We demonstrate how this framework can be used to describe existing MPC deployment scenarios, how new opportunities in a scenario can be observed by disentangling roles inhabited by the involved parties, and how this can motivate the development of MPC libraries and software tools that specialize not by application domain but by role.more » « less