Title: Gradient Coding With Iterative Block Leverage Score Sampling
Gradient coding is a method for mitigating straggling servers in a centralized computing network that uses erasure-coding techniques to carry out first-order optimization methods distributively. Randomized numerical linear algebra uses randomization to develop improved algorithms for large-scale linear algebra computations. In this paper, we propose a method for distributed optimization that combines gradient coding and randomized numerical linear algebra. The proposed method uses a randomized ℓ2-subspace embedding and a gradient coding technique to distribute blocks of data to the computational nodes of a centralized network, and at each iteration the central server requires only a small number of computations to obtain the steepest descent update. The novelty of our approach is that the data is replicated according to importance scores, called block leverage scores, in contrast to most gradient coding approaches, which replicate the data blocks uniformly. Furthermore, we do not require a decoding step at each iteration, avoiding a bottleneck in previous gradient coding schemes. We show that our approach results in a valid ℓ2-subspace embedding and that the resulting approximation converges to the optimal solution.
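To make the sampling distribution concrete, here is a minimal NumPy sketch of block leverage scores and score-proportional block replication. The function names, the SVD-based score computation, and the rounding rule are simplifications for illustration, not the paper's implementation.

```python
import numpy as np

def block_leverage_scores(A, block_size):
    """Block leverage scores of the row blocks of A: with the thin SVD
    A = U diag(s) Vt, the score of row block I is ||U[I, :]||_F^2,
    and the scores sum to rank(A)."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    n = A.shape[0]
    blocks = [slice(i, min(i + block_size, n)) for i in range(0, n, block_size)]
    scores = np.array([np.sum(U[b] ** 2) for b in blocks])
    return scores, blocks

def replication_counts(scores, total_replicas):
    """Replicate each block in proportion to its block leverage score
    (hypothetical rounding rule; the paper's exact allocation differs)."""
    p = scores / scores.sum()
    return np.maximum(1, np.rint(p * total_replicas).astype(int))

# Example: 12 row blocks of 50 rows each, spread over roughly 40 replicas.
rng = np.random.default_rng(0)
A = rng.standard_normal((600, 20))
scores, blocks = block_leverage_scores(A, block_size=50)
print(replication_counts(scores, total_replicas=40))
```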
Award ID(s):
2037304
PAR ID:
10553112
Publisher / Repository:
IEEE Transactions on Information Theory
Date Published:
Journal Name:
IEEE Transactions on Information Theory
Volume:
70
Issue:
9
ISSN:
0018-9448
Page Range / eLocation ID:
6639 to 6664
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A new method, consisting exclusively of simple linear algebra computations, is presented for computing the linear programming Shannon outer bound on the network coding capacity region of a directed hypergraph network. This linear-algebraic formulation yields a new upper bound on the worst-case complexity of computing the Shannon outer bound on a network coding capacity region.
  2. Ye, Qiang (Ed.)
    An inexact rational Krylov subspace method is studied to solve large-scale nonsymmetric eigenvalue problems. Each iteration (outer step) of the rational Krylov subspace method requires the solution of a shifted linear system to enlarge the subspace, performed by an iterative linear solver for large-scale problems. Errors are introduced at each outer step if these linear systems are solved approximately by iterative methods (inner steps), and they accumulate in the rational Krylov subspace. In this article, we derive an upper bound on the errors introduced at each outer step that maintains the same convergence as the exact rational Krylov subspace method for approximating an invariant subspace. Since this bound is inversely proportional to the current eigenresidual norm of the target invariant subspace, the tolerance of the iterative linear solves at each outer step can be relaxed as the outer iteration progresses. A restarted variant of the inexact rational Krylov subspace method is also proposed. Numerical experiments show the effectiveness of relaxing the inner tolerance to save computational cost.
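    As a rough illustration of the relaxation strategy, the toy loop below runs an inexact shift-invert iteration (a simplification of the rational Krylov setting) and loosens the inner solver tolerance as the eigenresidual shrinks. It assumes a recent SciPy in which gmres accepts an rtol keyword; the tolerance schedule is illustrative, not the paper's bound.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def inexact_shift_invert(A, shift, v0, outer_steps=25, c=1e-3):
    """Toy inexact shift-invert iteration: the inner GMRES tolerance is
    allowed to grow as the eigenresidual of the current pair shrinks."""
    n = A.shape[0]
    op = LinearOperator((n, n), matvec=lambda x: A @ x - shift * x)
    v = v0 / np.linalg.norm(v0)
    for _ in range(outer_steps):
        theta = v @ (A @ v)                         # Rayleigh quotient
        res = np.linalg.norm(A @ v - theta * v)     # eigenresidual norm
        inner_tol = min(1e-1, c / max(res, 1e-15))  # relax as res -> 0
        w, _ = gmres(op, v, rtol=inner_tol)         # inexact inner solve
        v = w / np.linalg.norm(w)
    return theta, v

# Example: eigenvalue of a random symmetric matrix nearest the shift 0.5.
rng = np.random.default_rng(1)
M = rng.standard_normal((200, 200)); M = (M + M.T) / 2
theta, v = inexact_shift_invert(M, 0.5, rng.standard_normal(200))
print(theta)
```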
  3. Randomized matrix algorithms have had significant recent impact on numerical linear algebra. One especially powerful class of methods consists of algorithms for approximate matrix multiplication based on sampling. Such methods typically sample individual matrix rows and columns using carefully chosen importance sampling probabilities. However, due to practical considerations like memory locality and the preservation of matrix structure, it is often preferable to sample contiguous blocks of rows and columns all together. Recently, (Wu, 2018) addressed this setting by developing an approximate matrix multiplication method based on block sampling. However, the method is inefficient, as it requires knowledge of optimal importance sampling probabilities that are expensive to compute. We address this issue by showing that the method of Wu can be accelerated through the use of a randomized implicit trace estimation method. Doing so allows us to provably reduce the cost of sampling to near-linear in the size of the matrices being multiplied, without impacting the accuracy of the final approximate matrix multiplication. Overall, this yields a fast practical algorithm, which we test on a number of synthetic and real-world data sets. We complement our algorithmic contribution with the first extensive empirical comparison of block algorithms for randomized matrix multiplication. Our method offers a significant runtime advantage over the method of (Wu, 2018) and also outperforms basic uniform sampling of blocks. However, we find that another recent method (Charalambides, 2021), which uses sub-optimal but efficiently computable sampling probabilities, often (but not always) offers the best trade-off between speed and accuracy.
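    The sketch below illustrates the core idea under simplified assumptions: block sampling probabilities proportional to estimates of ||A_i B_i||_F obtained with a Hutchinson-style probe, so the block products are never formed when computing the probabilities. Function names and constants are ours, not the paper's.

```python
import numpy as np

def block_amm(A, B, block_size, s, num_probes=5, seed=None):
    """Block-sampled approximate matrix multiplication A @ B.
    Importance weights ||A_i B_i||_F are estimated via the identity
    ||M||_F^2 = E[||M g||^2] for g ~ N(0, I), evaluated as A_i @ (B_i @ g),
    so the product M = A_i B_i is never formed when computing weights."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    p = B.shape[1]
    blocks = [slice(i, min(i + block_size, n)) for i in range(0, n, block_size)]
    est = np.empty(len(blocks))
    for j, b in enumerate(blocks):
        G = rng.standard_normal((p, num_probes))
        est[j] = np.mean(np.sum((A[:, b] @ (B[b] @ G)) ** 2, axis=0))
    q = np.sqrt(est)
    q /= q.sum()                            # sampling probabilities
    picks = rng.choice(len(blocks), size=s, p=q)
    C = np.zeros((m, p))
    for k in picks:                         # unbiased: E[C] = A @ B
        C += (A[:, blocks[k]] @ B[blocks[k]]) / (s * q[k])
    return C
```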
  4. We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function. Our method is based on performing an approximate Newton step using a random projection of the Hessian. Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effective dimension of the Hessian matrix. Leveraging this novel fundamental result, we design an algorithm with a sketch size proportional to the effective dimension and which exhibits a quadratic rate of convergence. This result dramatically improves on the classical linear-quadratic convergence rates of state-of-the-art sub-sampled Newton methods. However, in most practical cases, the effective dimension is not known beforehand, and this raises the question of how to pick a sketch size as small as the effective dimension while preserving a quadratic convergence rate. Our second and main contribution is thus to propose an adaptive sketch size algorithm with quadratic convergence rate and which does not require prior knowledge or estimation of the effective dimension: at each iteration, it starts with a small sketch size, and increases it until quadratic progress is achieved. Importantly, we show that the embedding dimension remains proportional to the effective dimension throughout the entire path and that our method achieves state-of-the-art computational complexity for solving convex optimization programs with a strongly convex component. We discuss and illustrate applications to linear and quadratic programming, as well as logistic regression and other generalized linear models.
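    A minimal sketch of the adaptive idea, assuming access to the Hessian square root C (so H = C.T C): start with a small Gaussian sketch and double it until the step makes sufficient progress. The acceptance test here is a crude stand-in for the paper's criterion, which is tied to quadratic convergence of the Newton decrement.

```python
import numpy as np

def sketched_newton_step(C, grad, m, rng):
    """Approximate Newton step from a Gaussian sketch S of the Hessian
    square root C (H = C.T @ C): solve (S C).T (S C) dx = -grad."""
    S = rng.standard_normal((m, C.shape[0])) / np.sqrt(m)
    SC = S @ C
    d = SC.shape[1]
    return np.linalg.solve(SC.T @ SC + 1e-10 * np.eye(d), -grad)

def adaptive_step(f, x, grad, C, m0=8, seed=None):
    """Start with a small sketch size and double it until the step gives
    sufficient decrease (a stand-in for the paper's adaptive test)."""
    rng = np.random.default_rng(seed)
    m = m0
    while True:
        dx = sketched_newton_step(C, grad, m, rng)
        if f(x + dx) <= f(x) + 0.25 * grad @ dx or m >= C.shape[0]:
            return x + dx, m
        m *= 2  # grow the embedding dimension until progress is achieved
```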
  5. Several iterative soft-thresholding algorithms, such as FISTA, have been proposed in the literature for solving regularized linear discrete inverse problems that arise in various applications in science and engineering. These algorithms are easy to implement, but their rates of convergence may be slow. This paper describes novel approaches to reduce the computations required for each iteration by using Krylov subspace techniques. Specifically, we propose to impose sparsity on the coefficients in the representation of the computed solution in terms of a Krylov subspace basis. Several numerical examples from image deblurring and computerized tomography are used to illustrate the efficiency and accuracy of the proposed methods.
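    A toy version of the idea: build an orthonormal Krylov basis, then run FISTA with soft thresholding on the coefficients of the solution in that basis. The basis construction and thresholding rule are simplified assumptions; the paper's methods may differ in both.

```python
import numpy as np

def krylov_fista(A, b, k=20, lam=1e-2, iters=100):
    """FISTA on the coefficients y of x = V y, where V is an orthonormal
    basis of the Krylov subspace K_k(A.T A, A.T b); the l1 penalty imposes
    sparsity on the basis coefficients rather than on x itself."""
    n = A.shape[1]
    V = np.zeros((n, k))
    w = A.T @ b
    V[:, 0] = w / np.linalg.norm(w)
    for j in range(1, k):                       # Gram-Schmidt Krylov basis
        w = A.T @ (A @ V[:, j - 1])
        w -= V[:, :j] @ (V[:, :j].T @ w)
        V[:, j] = w / np.linalg.norm(w)
    M = A @ V                                   # reduced operator (m x k)
    L = np.linalg.norm(M, 2) ** 2               # gradient Lipschitz constant
    y, z, t = np.zeros(k), np.zeros(k), 1.0
    for _ in range(iters):                      # FISTA with soft thresholding
        g = M.T @ (M @ z - b)
        step = z - g / L
        y_new = np.sign(step) * np.maximum(np.abs(step) - lam / L, 0.0)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = y_new + ((t - 1) / t_new) * (y_new - y)
        y, t = y_new, t_new
    return V @ y

# Example: small sparse-recovery least-squares problem.
rng = np.random.default_rng(2)
A = rng.standard_normal((300, 200))
x_true = np.zeros(200); x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(300)
x = krylov_fista(A, b)
```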