skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Lagrange Coded Computing with Sparsity Constraints
In this paper, we propose a distributed coding scheme that allows for lower computation cost per computing node than the standard Lagrange Coded Computing scheme. The proposed coding scheme is useful for cases where the elements of the input data set are of large dimensions and the computing nodes have limited computation power. This coding scheme provides a trade-off between the computation cost per worker and the recovery threshold in a distributed coded computing framework. The proposed scheme is also extended to provide data privacy against at most t colluding worker nodes in the system.  more » « less
Award ID(s):
1763657
PAR ID:
10180249
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Page Range / eLocation ID:
284 to 289
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    A major hurdle in machine learning is scalability to massive datasets. Approaches to overcome this hurdle include compression of the data matrix and distributing the computations. Leverage score sampling provides a compressed approximation of a data matrix using an importance weighted subset. Gradient coding has been recently proposed in distributed optimization to compute the gradient using multiple unreliable worker nodes. By designing coding matrices, gradient coded computations can be made resilient to stragglers, which are nodes in a distributed network that degrade system performance. We present a novel weighted leverage score approach, that achieves improved performance for distributed gradient coding by utilizing an importance sampling. 
    more » « less
  2. null (Ed.)
    This paper studies the distributed linearly separable computation problem, which is a generalization of many existing distributed computing problems such as distributed gradient coding and distributed linear transform. A master asks N distributed workers to compute a linearly separable function of K datasets, which is a set of Kc linear combinations of K equal-length messages (each message is a function of one dataset). We assign some datasets to each worker in an uncoded manner, who then computes the corresponding messages and returns some function of these messages, such that from the answers of any Nr out of N workers the master can recover the task function with high probability. In the literature, the specific case where Kc = 1 or where the computation cost is minimum has been considered. In this paper, we focus on the general case (i.e., general Kc and general computation cost) and aim to find the minimum communication cost. We first propose a novel converse bound on the communication cost under the constraint of the popular cyclic assignment (widely considered in the literature), which assigns the datasets to the workers in a cyclic way. Motivated by the observation that existing strategies for distributed computing fall short of achieving the converse bound, we propose a novel distributed computing scheme for some system parameters. The proposed computing scheme is optimal for any assignment when Kc is large and is optimal under the cyclic assignment when the numbers of workers and datasets are equal or Kc is small. In addition, it is order optimal within a factor of 2 under the cyclic assignment for the remaining cases. 
    more » « less
  3. Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing across distributed workers to achieve speed-up and efficiency. Several computational tasks are of sequential nature, and involve multiple passes over the data. At each iteration over the data, it is common practice to randomly re-shuffle the data at the master node, assigning different batches for each worker to process. This random re-shuffling operation comes at the cost of extra communication overhead, since at each shuffle, new data points need to be delivered to the distributed workers. In this paper, we focus on characterizing the information theoretically optimal communication overhead for the distributed data shuffling problem. We propose a novel coded data delivery scheme for the case of no excess storage, where every worker can only store the assigned data batches under processing. Our scheme exploits a new type of coding opportunity and is applicable to any arbitrary shuffle, and for any number of workers. We also present information theoretic lower bounds on the minimum communication overhead for data shuffling, and show that the proposed scheme matches this lower bound for the worst-case communication overhead. 
    more » « less
  4. Stragglers, Byzantine workers, and data privacy are the main bottlenecks in distributed cloud computing. Some prior works proposed coded computing strategies to jointly address all three challenges. They require either a large number of workers, a significant communication cost or a significant computational complexity to tolerate Byzantine workers. Much of the overhead in prior schemes comes from the fact that they tightly couple coding for all three problems into a single framework. In this paper, we propose Adaptive Verifiable Coded Computing (AVCC) framework that decouples the Byzantine node detection challenge from the straggler tolerance. AVCC leverages coded computing just for handling stragglers and privacy, and then uses an orthogonal approach that leverages verifiable computing to mitigate Byzantine workers. Furthermore, AVCC dynamically adapts its coding scheme to trade-off straggler tolerance with Byzantine protection. We evaluate AVCC on a compute-intensive distributed logistic regression application. Our experiments show that AVCC achieves up to 4.2× speedup and up to 5.1% accuracy improvement over the state-of-the-art Lagrange coded computing approach (LCC). AVCC also speeds up the conventional uncoded implementation of distributed logistic regression by up to 7.6×, and improves the test accuracy by up to 12.1%. 
    more » « less
  5. null (Ed.)
    This paper considers the MapReduce-like coded distributed computing framework originally proposed by Li et al., which uses coding techniques when distributed computing servers exchange their computed intermediate values, in order to reduce the overall traffic load. In their original model, servers are connected via an error-free common communication bus allowing broadcast transmissions. However, this assumption is one of the major limitations for practical implementations since real-world data centers may have network topologies far more involved than a single broadcast bus. We formulate a topological coded distributed computing problem, where the computing servers communicate with each other through some switch network. By using a special instance of fat-tree topologies, referred to as t-ary fat-tree proposed by Al-Fares et al. which can be built by some inexpensive switches, we propose a coded distributed computing scheme to achieve the optimal max-link communication load (defined as the maximum load over all links) over any network topology. 
    more » « less