

Title: GroupSecAgg: Information Theoretic Secure Aggregation with Uncoded Groupwise Keys
Secure aggregation, a core component of federated learning, aggregates locally trained models from distributed users at a central server without revealing any other information about the users' local data. This paper follows a recently formulated information theoretic secure aggregation problem with user dropouts, where the objective is to characterize the minimum communication cost from the K users to the server during model aggregation. All existing secure aggregation protocols let the users share and store coded keys to guarantee security. Motivated by the fact that uncoded groupwise keys are more convenient to share and could be used in a wide range of practical applications, this paper is the first to consider uncoded groupwise keys, where the keys are mutually independent and each key is shared by a group of S users. We show that if S is beyond a threshold, a new secure aggregation protocol with uncoded groupwise keys, referred to as GroupSecAgg, can achieve the same optimal communication cost as the best protocol with coded keys. Experiments on Amazon EC2 show considerable improvements in the key sharing and model aggregation times compared to the state of the art.
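To make the idea of uncoded groupwise keys concrete, the following is a minimal sketch and not the GroupSecAgg protocol itself. It assumes no user drops out, and it only illustrates how mutually independent keys, each shared by a group of S users, can mask the individual models while cancelling in the sum. The modulus `P`, the function names, and the coefficient rule (the lowest-indexed group member adds S - 1 copies of the key, the others subtract one copy each) are hypothetical choices made for illustration.

```python
import itertools
import secrets

P = 2**31 - 1  # hypothetical prime modulus; all arithmetic is done mod P


def groupwise_keys(num_users, group_size, dim):
    """Draw one independent uniform key vector for every group of `group_size` users."""
    return {
        group: [secrets.randbelow(P) for _ in range(dim)]
        for group in itertools.combinations(range(num_users), group_size)
    }


def mask_model(user, model, keys):
    """Mask `model` using only the keys of the groups that `user` belongs to."""
    masked = list(model)
    for group, key in keys.items():
        if user not in group:
            continue
        # Assumed coefficient rule: the coefficients inside each group sum to
        # zero, so every key cancels once all users' messages are added up.
        coeff = len(group) - 1 if user == group[0] else -1
        masked = [(m + coeff * z) % P for m, z in zip(masked, key)]
    return masked


def aggregate(masked_models):
    """Server side: add the masked models coordinate-wise mod P."""
    return [sum(column) % P for column in zip(*masked_models)]


# Toy run with K = 4 users, groups of S = 3 users, and 3-dimensional models.
models = [[1, 2, 3], [10, 20, 30], [100, 200, 300], [1000, 2000, 3000]]
keys = groupwise_keys(num_users=4, group_size=3, dim=3)
masked = [mask_model(u, w, keys) for u, w in enumerate(models)]
assert aggregate(masked) == [1111, 2222, 3333]
```

Because the sketch has no mechanism for recovering the keys of dropped users, the sum is only recoverable when every user's message arrives; tolerating dropouts is the part that the protocol described in the abstract addresses.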
Award ID(s):
2145835
PAR ID:
10490350
Publisher / Repository:
IEEE
Journal Name:
ICC 2023 - IEEE International Conference on Communications
ISSN:
1938-1883
ISBN:
978-1-5386-7462-8
Page Range / eLocation ID:
3890 to 3895
Format(s):
Medium: X
Location:
Rome, Italy
Sponsoring Org:
National Science Foundation
More Like this
  1. Secure aggregation, a core component of federated learning, aggregates locally trained models from distributed users at a central server. The aggregation is “secure” in the sense that no information about the users’ local data may be leaked to the server beyond the aggregated local models. To guarantee security, some keys may be shared among the users (the key sharing phase). After the key sharing phase, each user masks its trained model and sends it to the server (the model aggregation phase). This paper follows the information theoretic secure aggregation problem originally formulated by Zhao and Sun, with the objective of characterizing the minimum communication cost from the K users in the model aggregation phase. Due to user dropouts, which are common in real systems, the server may not receive all messages from the users. A secure aggregation scheme should tolerate the dropout of at most K - U users, where U is a system parameter. The optimal communication cost was characterized by Zhao and Sun, but under the assumption that the keys stored by the users could be any random variables with arbitrary dependency. Motivated by the fact that uncoded groupwise keys are more convenient to share and could be used in a wide range of applications besides federated learning, in this paper we add one constraint to the above problem, namely, that the key variables are mutually independent and each key is shared by a group of S users, where S is another system parameter. To the best of our knowledge, all existing secure aggregation schemes (with information theoretic security or computational security) assign coded keys to the users. We show that if S > K - U, a new secure aggregation scheme with uncoded groupwise keys can achieve the same optimal communication cost as the best scheme with coded keys; if S ≤ K - U, uncoded groupwise key sharing is strictly sub-optimal.
Finally, we implement our proposed secure aggregation scheme on Amazon EC2 and compare it with existing secure aggregation schemes with offline key sharing.
  2. This paper considers the secure aggregation problem for federated learning under an information theoretic cryptographic formulation, where distributed training nodes (referred to as users) train models based on their own local data and a curious-but-honest server aggregates the trained models without retrieving other information about the users’ local data. Secure aggregation generally consists of two phases, namely a key sharing phase and a model aggregation phase. Because user dropouts are common in federated learning, the model aggregation phase should contain two rounds: in the first round the users transmit masked models, and in the second round, depending on which users survive the first round, the surviving users transmit further messages to help the server decrypt the sum of the users’ trained models. The objective of the considered information theoretic formulation is to characterize the capacity region of the communication rates from the users to the server in the two rounds of the model aggregation phase, assuming that key sharing has already been performed offline beforehand. In this context, Zhao and Sun completely characterized the capacity region under the assumption that the keys can be arbitrary random variables. More recently, an additional constraint, known as “uncoded groupwise keys,” has been introduced. Under this constraint, the system contains multiple independent keys, each shared by exactly S users, where S is a system parameter. The capacity region for the information theoretic secure aggregation problem with uncoded groupwise keys was established in our recent work subject to the condition S > K - U, where K is the total number of users and U is the designed minimum number of surviving users (another system parameter). In this paper we fully characterize the capacity region for this problem by matching a new converse bound with an achievable scheme.
Experimental results on Tencent Cloud show an improvement in the model aggregation time compared to the original secure aggregation scheme.
  3. In this paper, we explore the use of microcontrollers (MCUs) and crypto coprocessors to secure IoT applications, and show how developers may implement a low-cost platform that protects private keys against software attacks. We first demonstrate the plausibility of format string attacks on the ESP32, a popular MCU from Espressif that uses the Harvard architecture. The format string attacks can be used to remotely steal private keys hard-coded in the firmware. We then present a framework, termed SIC² (Securing IoT with Crypto Coprocessors), for secure key provisioning that protects end users' private keys from both software attacks and untrustworthy manufacturers. As a proof of concept, we pair the ESP32 with the low-cost ATECC608A cryptographic coprocessor by Microchip and connect to Amazon Web Services (AWS) and Amazon Elastic Compute Cloud (EC2) using a hardware-protected private key, which provides the security features of TLS communication, including authentication, encryption, and integrity. We have developed a prototype and performed extensive experiments to show that the ATECC608A crypto chip may reduce the TLS handshake time with the remote server by as much as 82%, and may lower the total energy consumption of the system by up to 70%. Our results indicate that securing IoT with crypto coprocessors is a practicable solution for low-cost MCU-based IoT devices.
  4. Coded caching is a new approach to decreasing the communication load during the peak hours of the network. It provides a significant gain, which is maximized in the centralized setting, where the server controls the placement. In many situations, however, each user fills its cache without any information about the placement of other users. We show that subspace precoding for placement improves the delivery load of a decentralized caching system compared to uncoded placement. Surprisingly, the proposed scheme achieves the delivery load of the centralized placement for K = 3 users over the entire range of cache sizes.
  5. To preserve the privacy of each user's demand from the other users, in this paper we formulate a novel information theoretic Device-to-Device (D2D) private caching model by adding a trusted server. In the delivery phase, the trusted server collects the users' demands and sends a query to each user, who then broadcasts packets according to this query. Two D2D private caching schemes (uncoded and coded) are proposed in this paper, and both are shown to be order optimal.