skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Byzantine-Tolerant Distributed Coordinate Descent
We study distributed coordinate descent (CD) in the master-worker architecture under adversarial attacks, where the data is partitioned (across the parameter space) and distributed among m worker nodes (t of which can be maliciously corrupt), which update some coordinates of their part of the parameter vector, in parallel and iteratively, using CD updates, with the help of the master. We propose a method based on data encoding and real error correction to combat the adversary. Our method can tolerate up to ⌈m-1/2⌉ corrupt nodes, which is information-theoretically optimal. Our design gives a trade-off between the resiliency t, the required redundancy, and the computation at master and worker nodes. For example, with constant overhead in the storage and computational complexity over that required by the plain distributed CD, we can tolerate up to m/3 corrupt nodes. We design a sparse encoding scheme, which yields low encoding complexity.  more » « less
Award ID(s):
1740047
PAR ID:
10185990
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE International Symposium on Information Theory
Page Range / eLocation ID:
2724 to 2728
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider distributed gradient computation, where both data and computation are distributed among m worker machines, t of which can be Byzantine adversaries, and a designated (master) node computes the model/parameter vector for generalized linear models, iteratively, using proximal gradient descent (PGD), of which gradient descent (GD) is a special case. The Byzantine adversaries can (collaboratively) deviate arbitrarily from their gradient computation. To solve this, we propose a method based on data encoding and (real) error correction to combat the adversarial behavior. We can tolerate up to t <= (m−1)/2 corrupt worker nodes, which is 2 information-theoretically optimal. Our method does not assume any probability distribution on the data. We develop a sparse encoding scheme which enables computationally efficient data encoding. We demonstrate a trade-off between the number of adversaries tolerated and the resource requirement (storage and computational complexity). As an example, our scheme incurs a constant overhead (storage and computational complexity) over that required by the distributed PGD algorithm, without adversaries, for t <= m . Our encoding works as efficiently in the streaming data setting as it does in the offline setting, in which all the data is available beforehand. 
    more » « less
  2. We study stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. Building upon the recent advances in algorithmic high-dimensional robust statistics, in each SGD iteration, master employs a non-trivial decoding to estimate the true gradient from the unbiased stochastic gradients received from workers, some of which may be corrupt. We provide convergence analyses for both strongly-convex and non-convex smooth objectives under standard SGD assumptions. We can control the approximation error of our solution in both these settings by the mini-batch size of stochastic gradients; and we can make the approximation error as small as we want, provided that workers use a sufficiently large mini-batch size. Our algorithm can tolerate less than 1/3 fraction of Byzantine workers. It can approximately find the optimal parameters in the strongly-convex setting exponentially fast, and reaches to an approximate stationary point in the non-convex setting with linear speed, i.e., with a rate of 1/T, thus, matching the convergence rates of vanilla SGD in the Byzantine-free setting. 
    more » « less
  3. Distributed Key Generation (DKG) is a technique to bootstrap threshold cryptosystems without a trusted third party and is a building block to decentralized protocols such as randomness beacons, threshold signatures, and general multiparty computation. Until recently, DKG protocols have assumed the synchronous model and thus are vulnerable when their underlying network assumptions do not hold. The recent advancements in asynchronous DKG protocols are insufficient as they either have poor efficiency or limited functionality, resulting in a lack of concrete implementations. In this paper, we present a simple and concretely efficient asynchronous DKG (ADKG) protocol. In a network of n nodes, our ADKG protocol can tolerate up to t < n/3 malicious nodes and have an expected O(κn^3) communication cost, where κ is the security parameter. Our ADKG protocol produces a field element as the secret and is thus compatible with off-the-shelf threshold cryptosystems. We implement our ADKG protocol and evaluate it using a network of up to 128 nodes in geographically distributed AWS instances. Our evaluation shows that our protocol takes as low as 3 and 9.5 seconds to terminate for 32 and 64 nodes, respectively. Also, each node sends only 0.7 Megabytes and 2.9 Megabytes of data during the two experiments, respectively. 
    more » « less
  4. We consider the problem of distributed corruption detection in networks. In this model each node of a directed graph is either truthful or corrupt. Each node reports the type (truthful or corrupt) of each of its outneighbors. If it is truthful, it reports the truth, whereas if it is corrupt, it reports adversarially. This model, first considered by Preparata, Metze and Chien in 1967, motivated by the desire to identify the faulty components of a digital system by having the other components checking them, became known as the PMC model. The main known results for this model characterize networks in which all corrupt (that is, faulty) nodes can be identified, when there is a known upper bound on their number. We are interested in networks in which a large fraction of the nodes can be classified. It is known that in the PMC model, in order to identify all corrupt nodes when their number is t, all in-degrees have to be at least t. In contrast, we show that in d regular-graphs with strong expansion properties, a 1 - O(1/d) fraction of the corrupt nodes, and a 1 - O(1/d) fraction of the truthful nodes can be identified, whenever there is a majority of truthful nodes. We also observe that if the graph is very far from being a good expander, namely, if the deletion of a small set of nodes splits the graph into small components, then no corruption detection is possible even if most of the nodes are truthful. Finally we discuss the algorithmic aspects and the computational hardness of the problem. 
    more » « less
  5. We propose distributed scheduling algorithms that guarantee a constant fraction of the maximum throughput for typical wireless topologies, and have O(1) delay and complexity in the network size. Our algorithms resolve collisions among pairs of conflicting nodes by assigning a master-slave hierarchy. When the master-slave hierarchy is chosen randomly, our algorithm matches the throughput performance of the maximal scheduling policies, with a complexity and delay that do not scale with network size. When the master-slave hierarchy is chosen based on the network topology, the throughput performance of our algorithm is characterized by a parameter of the conflict graph called the master-interference degree. For commonly-used conflict-graph topologies, our results lead to the best known throughput guarantees among the algorithms that have O(1) delay and complexity. Numerical results indicate that our algorithms outperform the existing O(1) complexity algorithms like Q-CSMA. 
    more » « less