Title: ELSA: Secure Aggregation for Federated Learning with Malicious Actors
Federated learning (FL) is an increasingly popular approach for machine learning (ML) in cases where the training dataset is highly distributed. Clients perform local training on their datasets and the updates are then aggregated into the global model. Existing protocols for aggregation are either inefficient or do not consider the case of malicious actors in the system. This is a major barrier to making FL an ideal solution for privacy-sensitive ML applications. We present ELSA, a secure aggregation protocol for FL that breaks this barrier: it is efficient and addresses the existence of malicious actors at the core of its design. Similar to prior work on Prio and Prio+, ELSA provides a novel secure aggregation protocol built out of distributed trust across two servers that keeps individual client updates private as long as one server is honest, defends against malicious clients, and is efficient end-to-end. Compared to prior works, the distinguishing theme in ELSA is that instead of the servers generating cryptographic correlations interactively, the clients act as untrusted dealers of these correlations without compromising the protocol’s security. This leads to a much faster protocol that also achieves stronger security at that efficiency than prior work. We introduce new techniques that retain privacy even when a server is malicious, at a small added cost of 7-25% in runtime and with a negligible increase in communication over the semi-honest-server case. Our work improves end-to-end runtime over prior work with similar security guarantees by large margins: single-aggregator RoFL by up to 305x (for the models we consider), and distributed-trust Prio by up to 8x.
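The two-server distributed-trust idea in the abstract can be illustrated with plain additive secret sharing. The sketch below is not ELSA's protocol: it omits the defenses against malicious clients, the client-dealt correlations, and the malicious-server protections the abstract describes. The ring size `MODULUS`, the function names, and the integer-encoded updates are all assumptions made for illustration. It shows only why neither server alone sees an individual update while the two accumulated shares together reveal the sum.

```python
import secrets

MODULUS = 2**32  # assumed ring size for the shares (illustrative choice)


def share(update):
    """Split an integer vector into two additive shares modulo MODULUS.

    Each share on its own is uniformly random, so a single server
    holding one share learns nothing about the client's update.
    """
    share_a = [secrets.randbelow(MODULUS) for _ in update]
    share_b = [(x - a) % MODULUS for x, a in zip(update, share_a)]
    return share_a, share_b


def add_vectors(u, v):
    """Element-wise addition modulo MODULUS."""
    return [(x + y) % MODULUS for x, y in zip(u, v)]


def aggregate(client_updates):
    """Simulate two servers that each accumulate one share per client.

    Only the final accumulators are combined, which reveals the sum of
    all updates but not any individual update.
    """
    dim = len(client_updates[0])
    acc_a = [0] * dim  # hypothetical server A state
    acc_b = [0] * dim  # hypothetical server B state
    for update in client_updates:
        a, b = share(update)
        acc_a = add_vectors(acc_a, a)
        acc_b = add_vectors(acc_b, b)
    return add_vectors(acc_a, acc_b)


updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
total = aggregate(updates)
```

Real model updates are floating-point vectors, so a deployment along these lines would also need a fixed-point encoding into the ring; that step is left out here.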
Award ID(s):
1943347
NSF-PAR ID:
10400324
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the IEEE Symposium on Security and Privacy
ISSN:
1063-9578
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Federated learning (FL) is an increasingly popular approach for machine learning (ML) when the training dataset is highly distributed. Clients perform local training on their datasets and the updates are then aggregated into the global model. Existing protocols for aggregation are either inefficient or do not consider the case of malicious actors in the system. This is a major barrier to making FL an ideal solution for privacy-sensitive ML applications. In this talk, I will present ELSA, a secure aggregation protocol for FL that breaks this barrier: it is efficient and addresses the existence of malicious actors (clients + servers) at the core of its design. Similar to prior work Prio and Prio+, ELSA provides a novel secure aggregation protocol built out of distributed trust across two servers that keeps individual client updates private as long as one server is honest, defends against malicious clients, and is efficient end-to-end. Compared to prior works, the distinguishing theme in ELSA is that instead of the servers generating cryptographic correlations interactively, the clients act as untrusted dealers of these correlations without compromising the protocol’s security. This leads to a much faster protocol that also achieves stronger security at that efficiency than prior work. We introduce new techniques that retain privacy even when a server is malicious, at a small added cost of 7-25% in runtime and with a negligible increase in communication over the semi-honest-server case. ELSA improves end-to-end runtime over prior work with similar security guarantees by large margins: single-aggregator RoFL by up to 305x (for the models we consider), and distributed-trust Prio by up to 8x (with up to 16x faster server-side protocol). Additionally, ELSA can be run in a bandwidth-saver mode for clients who are geographically bandwidth-constrained, an important property that is missing from prior works.
  2. Despite the great potential of Federated Learning (FL) in large-scale distributed learning, the current system is still subject to several privacy issues due to the fact that local models trained by clients are exposed to the central server. Consequently, secure aggregation protocols for FL have been developed to conceal the local models from the server. However, we show that, by manipulating the client selection process, the server can circumvent the secure aggregation to learn the local models of a victim client, indicating that secure aggregation alone is inadequate for privacy protection. To tackle this issue, we leverage blockchain technology to propose a verifiable client selection protocol. Owing to the immutability and transparency of blockchain, our proposed protocol enforces a random selection of clients, making the server unable to control the selection process at its discretion. We present security proofs showing that our protocol is secure against this attack. Additionally, we conduct several experiments on an Ethereum-like blockchain to demonstrate the feasibility and practicality of our solution. 
  3. Standard ML relies on training using a centrally collected dataset, while collaborative learning techniques such as Federated Learning (FL) enable data to remain decentralized at client locations. In FL, a central server coordinates the training process, reducing computation and communication expenses for clients. However, this centralization can lead to server congestion and heightened risk of malicious activity or data privacy breaches. In contrast, Peer-to-Peer Learning (P2PL) is a fully decentralized system where nodes manage both local training and aggregation tasks. While P2PL promotes privacy by eliminating the need to trust a single node, it also results in increased computation and communication costs, along with potential difficulties in achieving consensus among nodes. To address the limitations of both FL and P2PL, we propose a hybrid approach called Hubs-and-Spokes Learning (HSL). In HSL, hubs function similarly to FL servers, maintaining consensus but exerting less control over spokes. This paper argues that HSL’s design allows for greater availability and privacy than FL, while reducing computation and communication costs compared to P2PL. Additionally, HSL maintains consensus and integrity in the learning process. 
  4. Edge Computing (EC) has seen a continuous rise in popularity because it addresses the latency and communication issues associated with edge devices transferring data to remote servers. EC achieves this by bringing the cloud closer to edge devices. Although EC does an excellent job of solving the latency and communication issues, it does not solve the privacy issues associated with users transferring personal data to the nearby edge server. Federated Learning (FL) is an approach that was introduced to solve the privacy issues associated with data transfers to distant servers. FL attempts to resolve this issue by bringing the code to the data, in contrast to the traditional approach of sending the data to remote servers. In FL, the data stays on the source device, and a Machine Learning (ML) model to be trained on the local data is brought to the end device instead. End devices train the ML model using local data and then send the model updates back to the server for aggregation. However, asking random devices to train a model using their local data carries risks, such as a participant poisoning the model by training on malicious data to produce bogus parameters. In this paper, an approach to mitigate data poisoning attacks in a federated learning setting is investigated. The application of the approach is highlighted, and its practical and secure nature is illustrated using numerical results.
  5. Sherr, Micah ; Shafiq, Zubair (Ed.)
    Private heavy-hitters is a data-collection task where multiple clients possess private bit strings, and data-collection servers aim to identify the most popular strings without learning anything about the clients' inputs. In this work, we introduce PLASMA: a private analytics framework in the three-server setting that protects the privacy of honest clients and the correctness of the protocol against a coalition of malicious clients and a malicious server. Our core primitives are a verifiable incremental distributed point function (VIDPF) and a batched consistency check, which are of independent interest. Our VIDPF introduces new methods to validate client inputs based on hashing. Meanwhile, our batched consistency check uses Merkle trees to validate multiple client sessions together in a batch. This drastically reduces server communication across multiple client sessions, resulting in significantly less communication compared to related works. Finally, we compare PLASMA with the recent works of Asharov et al. (CCS'22) and Poplar (S&P'21) in terms of monetary cost for different input sizes.