This work introduces a Byzantine resilient so- lution for learning low-dimensional linear rep- resentation. Our main contribution is the de- velopment of a provably Byzantine-resilient Alt- GDmin algorithm for solving this problem in a federated setting. We argue that our solution is sample-efficient, fast, and communication- efficient. In solving this problem, we also intro- duce a novel secure solution to the federated sub- space learning meta-problem that occurs in many different applications.
more »
« less
Byzantine Resilient and Fast Federated Few-Shot Learning
This work introduces a Byzantine resilient so- lution for learning low-dimensional linear rep- resentation. Our main contribution is the de- velopment of a provably Byzantine-resilient Alt- GDmin algorithm for solving this problem in a federated setting. We argue that our solution is sample-efficient, fast, and communication- efficient. In solving this problem, we also intro- duce a novel secure solution to the federated sub- space learning meta-problem that occurs in many different applications.
more »
« less
- Award ID(s):
- 2341359
- PAR ID:
- 10597126
- Publisher / Repository:
- Proceedings of the 41st International Conference on Machine Learning (ICML) 2024
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
This work considers two related learning problems in a federated attack-prone setting – federated principal com- ponents analysis (PCA) and federated low rank column-wise sensing (LRCS). The node attacks are assumed to be Byzan- tine which means that the attackers are omniscient and can collude. We introduce a novel provably Byzantine-resilient communication-efficient and sample-efficient algorithm, called Subspace-Median, that solves the PCA problem and is a key part of the solution for the LRCS problem. We also study the most natural Byzantine-resilient solution for federated PCA, a geometric median based modification of the federated power method, and explain why it is not useful. Our second main contribution is a complete alternating gradient descent (GD) and minimization (altGDmin) algorithm for Byzantine-resilient horizontally federated LRCS and sample and communication complexity guarantees for it. Extensive simulation experiments are used to corroborate our theoretical guarantees. The ideas that we develop for LRCS are easily extendable to other LR recovery problems as well.more » « less
-
Privacy and Byzantine resilience are two indispensable requirements for a federated learning (FL) system. Although there have been extensive studies on privacy and Byzantine security in their own track, solutions that consider both remain sparse. This is due to difficulties in reconciling privacy-preserving and Byzantine-resilient algorithms. In this work, we propose a solution to such a two-fold issue. We use our version of differentially private stochastic gradient descent (DP-SGD) algorithm to preserve privacy and then apply our Byzantine-resilient algorithms. We note that while existing works follow this general approach, an in-depth analysis on the interplay between DP and Byzantine resilience has been ignored, leading to unsatisfactory performance. Specifically, for the random noise introduced by DP, previous works strive to reduce its seemingly detrimental impact on the Byzantine aggregation. In contrast, we leverage the random noise to construct a first-stage aggregation that effectively rejects many existing Byzantine attacks. Moreover, based on another property of our DP variant, we form a second-stage aggregation which provides a final sound filtering. Our protocol follows the principle of co-designing both DP and Byzantine resilience. We provide both theoretical proof and empirical experiments to show our protocol is effective: retaining high accuracy while preserving the DP guarantee and Byzantine resilience. Compared with the previous work, our protocol 1) achieves significantly higher accuracy even in a high privacy regime; 2) works well even when up to 90% distributive workers are Byzantine.more » « less
-
null (Ed.)We consider the problem of multiagent optimization wherein an unknown subset of agents suffer Byzantine faults and thus behave adversarially. We assume that each agent i has a local cost function fi , and the overarching goal of the good agents is to collaboratively minimize a global objective that properly aggregates these local cost functions. To the best of our knowledge, we are among the first to study Byzantine-resilient optimization where no central coordinating agent exists, and we are the first to characterize the structures of the convex coefficients of the achievable global objectives. Dealing with Byzantine faults is very challenging. For example, in contrast to fault-free networks, reaching Byzantine-resilient agreement even in the simplest setting is far from trivial. We take a step toward solving the proposed Byzantine-resilient multiagent optimization problem by focusing on scalar local cost functions. Our results might provide useful insights for the general local cost functions.more » « less
-
We study stochastic gradient descent (SGD) with local iterations in the presence of malicious/Byzantine clients, motivated by the federated learning. The clients, instead of communicating with the central server in every iteration, maintain their local models, which they update by taking several SGD iterations based on their own datasets and then communicate the net update with the server, thereby achieving communication-efficiency. Furthermore, only a subset of clients communicate with the server, and this subset may be different at different synchronization times. The Byzantine clients may collaborate and send arbitrary vectors to the server to disrupt the learning process. To combat the adversary, we employ an efficient high-dimensional robust mean estimation algorithm from Steinhardt et al.~i̧te[ITCS 2018]Resilience_SCV18 at the server to filter-out corrupt vectors; and to analyze the outlier-filtering procedure, we develop a novel matrix concentration result that may be of independent interest. We provide convergence analyses for strongly-convex and non-convex smooth objectives in the heterogeneous data setting, where different clients may have different local datasets, and we do not make any probabilistic assumptions on data generation. We believe that ours is the first Byzantine-resilient algorithm and analysis with local iterations. We derive our convergence results under minimal assumptions of bounded variance for SGD and bounded gradient dissimilarity (which captures heterogeneity among local datasets). We also extend our results to the case when clients compute full-batch gradients.more » « less
An official website of the United States government

