NSF PAR Search | NSF Public Access Repository


On the Rényi Differential Privacy of the Shuffle Model

https://doi.org/10.1145/3460120.3484794

Girgis, Antonious M.; Data, Deepesh; Diggavi, Suhas; Suresh, Ananda Theertha; Kairouz, Peter (November 2021, ACM Symposium on Computer and Communication Security (CCS))

The central question studied in this paper is Rényi Differential Privacy (RDP) guarantees for general discrete local randomizers in the shuffle privacy model. In the shuffle model, each of the 𝑛 clients randomizes its response using a local differentially private (LDP) mechanism and the untrusted server only receives a random permutation (shuffle) of the client responses without association to each client. The principal result in this paper is the first direct RDP bounds for general discrete local randomization in the shuffle privacy model, and we develop new analysis techniques for deriving our results which could be of independent interest. In applications, such an RDP guarantee is most useful when we use it for composing several private interactions. We numerically demonstrate that, for important regimes, with composition our bound yields an improvement in privacy guarantee by a factor of 8× over the state-of-the-art approximate Differential Privacy (DP) guarantee (with standard composition) for shuffle models. Moreover, combining with Poisson subsampling, our result leads to at least 10× improvement over subsampled approximate DP with standard composition.
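The shuffle model described in this abstract can be illustrated with a minimal sketch. The choice of k-ary randomized response as the local randomizer, the function names, and the parameter values are assumptions made for illustration only; the paper treats general discrete local randomizers, and this sketch does not compute any of its RDP bounds.

```python
import math
import random
from collections import Counter

def k_rr(value, k, eps0, rng):
    """k-ary randomized response: an eps0-LDP randomizer over {0, ..., k-1}.

    Assumed here as one example of a discrete local randomizer.
    """
    keep_prob = math.exp(eps0) / (math.exp(eps0) + k - 1)
    if rng.random() < keep_prob:
        return value
    # Otherwise report one of the other k-1 symbols uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1

def shuffle_model(inputs, k, eps0, seed=0):
    """Randomize each client's input locally, then shuffle the reports."""
    rng = random.Random(seed)
    reports = [k_rr(x, k, eps0, rng) for x in inputs]
    rng.shuffle(reports)  # the server receives the reports in random order
    return reports

inputs = [0, 1, 2, 3] * 250   # hypothetical data: n = 1000 clients, k = 4
reports = shuffle_model(inputs, k=4, eps0=1.0)
histogram = Counter(reports)  # the server can only use the multiset of reports
```

Because the reports arrive in random order, the server learns the histogram of randomized responses but not which client produced which response.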
Full Text Available
Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data

https://doi.org/10.1109/ISIT45174.2021.9518248

Data, Deepesh; Diggavi, Suhas (July 2021, IEEE International Symposium on Information Theory (ISIT))

We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any probabilistic assumptions on data generation. At the core of our algorithm, we use the polynomial-time outlier-filtering procedure for robust mean estimation proposed by Steinhardt et al. (ITCS 2018) to filter-out corrupt gradients. In order to be able to apply their filtering procedure in our heterogeneous data setting where workers compute stochastic gradients, we derive a new matrix concentration result, which may be of independent interest. We provide convergence analyses for smooth strongly-convex and non-convex objectives and show that our convergence rates match that of vanilla SGD in the Byzantine-free setting. In order to bound the heterogeneity, we assume that the gradients at different workers have bounded deviation from each other, and we also provide concrete bounds on this deviation in the statistical heterogeneous data model.
Full Text Available
Differentially Private Federated Learning with Shuffling and Client Self-Sampling

https://doi.org/10.1109/ISIT45174.2021.9517906

Girgis, Antonious M.; Data, Deepesh; Diggavi, Suhas (July 2021, IEEE International Symposium on Information Theory (ISIT))

Full Text Available
Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data

Data, Deepesh; Diggavi, Suhas N. (July 2021, International Conference on Machine Learning (ICML))

We study stochastic gradient descent (SGD) with local iterations in the presence of malicious/Byzantine clients, motivated by federated learning. The clients, instead of communicating with the central server in every iteration, maintain their local models, which they update by taking several SGD iterations based on their own datasets and then communicate the net update with the server, thereby achieving communication-efficiency. Furthermore, only a subset of clients communicate with the server, and this subset may be different at different synchronization times. The Byzantine clients may collaborate and send arbitrary vectors to the server to disrupt the learning process. To combat the adversary, we employ an efficient high-dimensional robust mean estimation algorithm from Steinhardt et al. (ITCS 2018) at the server to filter-out corrupt vectors; and to analyze the outlier-filtering procedure, we develop a novel matrix concentration result that may be of independent interest. We provide convergence analyses for strongly-convex and non-convex smooth objectives in the heterogeneous data setting, where different clients may have different local datasets, and we do not make any probabilistic assumptions on data generation. We believe that ours is the first Byzantine-resilient algorithm and analysis with local iterations. We derive our convergence results under minimal assumptions of bounded variance for SGD and bounded gradient dissimilarity (which captures heterogeneity among local datasets). We also extend our results to the case when clients compute full-batch gradients.
Full Text Available
Differentially Private Federated Learning with Shuffling and Client Self-Sampling

Girgis, Antonious M.; Data, Deepesh; Diggavi, Suhas (July 2021, IEEE International Symposium on Information Theory (ISIT))

This paper studies a distributed optimization problem in the federated learning (FL) framework under differential privacy constraints, whereby a set of clients having local samples are connected to an untrusted server, who wants to learn a global model while preserving the privacy of clients’ local datasets. We propose a new client sampling called self-sampling that reflects the random availability of clients in the learning process in FL. We analyze the differential privacy of the SGD with client self-sampling by composing amplification by sub-sampling along with amplification by shuffling. Furthermore, we analyze the convergence of the proposed SGD algorithm showing that we can get a reasonable learning performance while preserving the privacy of clients’ data even with client self-sampling.
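As a rough illustration of client sampling driven by random availability, the sketch below assumes that each client independently decides to participate in a round with a fixed probability q. This reading of self-sampling, the function name self_sample, and all parameter values are assumptions for illustration; the abstract does not spell out the exact scheme, and the paper's definition may differ.

```python
import random

def self_sample(num_clients, q, rng):
    """Each client flips its own coin and participates with probability q.

    No central party chooses the participants (assumed interpretation).
    """
    return [i for i in range(num_clients) if rng.random() < q]

rng = random.Random(0)
# Hypothetical setting: 1000 clients, participation probability 0.1, 5 rounds.
rounds = [self_sample(num_clients=1000, q=0.1, rng=rng) for _ in range(5)]
sizes = [len(participants) for participants in rounds]
```

Under this assumption the number of participants varies from round to round, unlike sampling a fixed-size subset of clients.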
Full Text Available
Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data

Data, Deepesh; Diggavi, Suhas N. (July 2021, IEEE International Symposium on Information Theory (ISIT))

We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any probabilistic assumptions on data generation. At the core of our algorithm, we use the polynomial-time outlier-filtering procedure for robust mean estimation proposed by Steinhardt et al. (ITCS 2018) to filter-out corrupt gradients. In order to be able to apply their filtering procedure in our heterogeneous data setting where workers compute stochastic gradients, we derive a new matrix concentration result, which may be of independent interest. We provide convergence analyses for smooth strongly-convex and non-convex objectives and show that our convergence rates match that of vanilla SGD in the Byzantine-free setting. In order to bound the heterogeneity, we assume that the gradients at different workers have bounded deviation from each other, and we also provide concrete bounds on this deviation in the statistical heterogeneous data model.
Full Text Available
Algorithms for reconstruction over single and multiple deletion channels

Sundara Rajan Srinivasavaradhan, Michelle Du (June 2021, IEEE Transactions on Information Theory)

Recent advances in DNA sequencing technology and DNA storage systems have rekindled the interest in deletion channels. Multiple recent works have looked at variants of sequence reconstruction over a single and over multiple deletion channels, a notoriously difficult problem due to its highly combinatorial nature. Although works in theoretical computer science have provided algorithms which guarantee perfect reconstruction with multiple independent observations from the deletion channel, they are only applicable in the large blocklength regime and, more restrictively, when the number of observations is also large. Indeed, with only a few observations, perfect reconstruction of the input sequence may not even be possible in most cases. In such situations, maximum likelihood (ML) and maximum a posteriori (MAP) estimates for the deletion channels are natural questions that arise, and these have remained open to the best of our knowledge. In this work, we take steps to answer the two aforementioned questions. Specifically: 1. We show that solving for the ML estimate over the single deletion channel (which can be cast as a discrete optimization problem) is equivalent to solving its relaxation, a continuous optimization problem; 2. We exactly compute the symbolwise posterior distributions (under some assumptions on the priors) for both the single as well as multiple deletion channels. As part of our contributions, we also introduce tools to visualize and analyze error events, which we believe could be useful in other related problems concerning deletion channels.
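The setting of multiple independent observations from a deletion channel can be simulated with a short sketch. It assumes the common i.i.d. model in which each symbol is deleted independently with probability delta; the input string, delta, and the function names are hypothetical, and the sketch does not implement the ML or MAP estimators studied in the paper.

```python
import random

def deletion_channel(x, delta, rng):
    """Delete each symbol of x independently with probability delta."""
    return "".join(c for c in x if rng.random() >= delta)

def is_subsequence(y, x):
    """Return True if y can be obtained from x by deleting symbols."""
    it = iter(x)
    return all(c in it for c in y)  # 'c in it' advances the iterator

rng = random.Random(0)
x = "ACGTTGCAAC"  # hypothetical input sequence
# Three independent observations (traces) of the same input.
traces = [deletion_channel(x, 0.2, rng) for _ in range(3)]
```

Every trace is a subsequence of the input, and the positions of the deleted symbols are not revealed to the receiver, which is what makes reconstruction from a few traces combinatorially hard.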
Full Text Available
Distortion-Based Lightweight Security for Cyber-Physical Systems

https://doi.org/10.1109/TAC.2020.3006814

Agarwal, Gaurav Kumar; Karmoose, Mohammed; Diggavi, Suhas; Fragouli, Christina; Tabuada, Paulo (April 2021, IEEE Transactions on Automatic Control)

Full Text Available
Shuffled Model of Federated Learning: Privacy, Accuracy and Communication Trade-Offs

https://doi.org/10.1109/JSAIT.2021.3056102

Girgis, Antonious M.; Data, Deepesh; Diggavi, Suhas; Kairouz, Peter; Suresh, Ananda Theertha (March 2021, IEEE Journal on Selected Areas in Information Theory)

Full Text Available
Data Encoding for Byzantine-Resilient Distributed Optimization

https://doi.org/10.1109/TIT.2020.3035868

Data, Deepesh; Song, Linqi; Diggavi, Suhas N. (February 2021, IEEE Transactions on Information Theory)

Full Text Available
