In cloud computing systems, elastic events and stragglers increase the uncertainty of the system, leading to computation delays. Coded elastic computing (CEC), introduced by Yang et al. in 2018, is a framework that mitigates the impact of elastic events using Maximum Distance Separable (MDS) coded storage. That work proposed CEC schemes for both matrix-vector multiplication and general matrix-matrix multiplication. However, in these applications the proposed CEC scheme cannot tolerate stragglers due to the limitations imposed by MDS codes. In this paper we propose a new elastic computing scheme that combines uncoded storage with Lagrange coded computing. The proposed scheme effectively mitigates the effects of both elasticity and stragglers, and it achieves lower complexity and a smaller recovery threshold than existing coded-storage-based schemes.
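Below is a minimal numerical sketch of the Lagrange coded computing idea underlying the proposed scheme: the data blocks are embedded in a low-degree polynomial, each worker evaluates the target function at its own point, and any set of results of size equal to the recovery threshold suffices, so a straggling or departed machine can simply be ignored. The block sizes, evaluation points, and the squaring job are illustrative assumptions, not the construction from the paper.

```python
import numpy as np

# Minimal sketch of Lagrange coded computing (LCC) for the polynomial job
# f(X) = X * X (elementwise square), with k = 2 data blocks and n = 4
# workers.  The recovery threshold is deg(f) * (k - 1) + 1 = 3, so any
# three results suffice and one straggler/departed worker is tolerated.
# All parameters here are illustrative assumptions.
k, n = 2, 4
X = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]    # the k data blocks
beta = [0.0, 1.0]                                   # interpolation points carrying the data
alpha = [2.0, 3.0, 4.0, 5.0]                        # workers' evaluation points

def lagrange_basis(j, z, pts):
    out = 1.0
    for m, b in enumerate(pts):
        if m != j:
            out *= (z - b) / (pts[j] - b)
    return out

# Encoder: u(z) = sum_j X_j * l_j(z), so that u(beta_j) = X_j.
def u(z):
    return sum(X[j] * lagrange_basis(j, z, beta) for j in range(k))

f = lambda x: x * x                                 # degree-2 polynomial map
results = {i: f(u(a)) for i, a in enumerate(alpha)} # worker i computes f(u(alpha_i))

# Suppose worker 1 straggles; decode from the other three results by
# interpolating the degree-2 polynomial g(z) = f(u(z)) back at the beta points.
fast = [0, 2, 3]
def decode(target):
    pts = [alpha[i] for i in fast]
    return sum(results[i] * lagrange_basis(m, target, pts)
               for m, i in enumerate(fast))

for j in range(k):
    assert np.allclose(decode(beta[j]), f(X[j]))    # recovers f(X_j) exactly
```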
Linear exact repair schemes for free MDS and Reed–Solomon codes over Galois rings
Codes over rings, especially over Galois rings, have been studied extensively for nearly three decades because of their similarity to linear codes over finite fields. A distributed storage system uses a linear code to encode a large file across several nodes. If one of the nodes fails, a linear exact repair scheme efficiently recovers the failed node by accessing and downloading data from the remaining servers of the storage system. In this paper, we develop a linear repair scheme for free maximum distance separable codes, which coincide with free maximum distance with respect to rank codes over Galois rings. In particular, we give a linear repair scheme for full-length Reed–Solomon codes over a Galois ring.
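As context for the repair problem, the following toy sketch encodes a file with a Reed–Solomon code over a prime field GF(p) and rebuilds a failed node by Lagrange interpolation from any k surviving nodes. This is the naive repair baseline over a field, not the bandwidth-efficient linear repair scheme over Galois rings developed in the paper; the parameters are assumptions for illustration.

```python
# Toy Reed-Solomon repair baseline over a prime field GF(p): the failed
# node is rebuilt by Lagrange-interpolating the message polynomial from any
# k surviving evaluations.  This is NOT the linear repair scheme over
# Galois rings from the paper; parameters are assumed.
p = 257                      # a small prime field size (assumption)
k, n = 3, 6                  # an [n, k] Reed-Solomon code
msg = [42, 7, 19]            # coefficients of the message polynomial, deg < k
pts = list(range(1, n + 1))  # distinct evaluation points, one per node

def f(x):
    return sum(c * pow(x, i, p) for i, c in enumerate(msg)) % p

codeword = [f(a) for a in pts]                      # symbol stored at each node

failed = 2                                          # index of the failed node
helpers = [i for i in range(n) if i != failed][:k]  # any k surviving nodes

def repair(target):
    """Lagrange interpolation of f(target) from the helpers' symbols."""
    val = 0
    for i in helpers:
        num, den = 1, 1
        for j in helpers:
            if j != i:
                num = num * (target - pts[j]) % p
                den = den * (pts[i] - pts[j]) % p
        val = (val + codeword[i] * num * pow(den, -1, p)) % p
    return val

assert repair(pts[failed]) == codeword[failed]      # failed symbol recovered
```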
- PAR ID: 10615697
- Publisher / Repository: World Scientific Connect
- Date Published:
- Journal Name: Journal of Algebra and Its Applications
- ISSN: 0219-4988
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- We consider the storage-retrieval rate trade-off in private information retrieval (PIR) systems using a Shannon-theoretic approach. Our focus is mostly on the canonical two-message two-database case, for which a coding scheme based on random codebook generation and the binning technique is proposed. This coding scheme reveals a hidden connection between PIR and the classic multiple description source coding problem. We first show that when the retrieval rate is kept optimal, the proposed non-linear scheme can achieve better performance than any linear scheme. Moreover, a non-trivial storage-retrieval rate trade-off can be achieved beyond space-sharing between this extreme point and the other optimal extreme point, which is achieved by the retrieve-everything strategy. We further show that, with a method akin to the expurgation technique, one can extract a zero-error PIR code from the random code. Outer bounds are also studied and compared to establish the superiority of the non-linear codes over linear codes. (A toy sketch of the classic linear two-database scheme, for contrast, follows this list.)
- Cloud storage systems generally add redundancy in storing content files, such that K files are replicated or erasure coded and stored on N > K nodes. In addition to providing reliability against failures, the redundant copies can be used to serve a larger volume of content access requests. A request for one of the files can either be sent to a systematic node or to one of the repair groups. In this paper, we seek to maximize the service capacity region, that is, the set of request arrival rates for the K files that can be supported by a coded storage system. We explore two aspects of this problem: 1) for a given erasure code, how to optimally split incoming requests between systematic nodes and repair groups, and 2) how to choose an underlying erasure code that maximizes the achievable service capacity region. In particular, we consider MDS and Simplex codes. Our analysis demonstrates that erasure coding makes the system more robust to skews in file popularity than simply replicating a file at multiple servers, and that coding and replication together can make the capacity region larger than either alone. (A toy feasibility check for this request splitting appears after this list.)
- Generalized low-density parity-check (GLDPC) codes, in which the single parity-check (SPC) nodes are replaced by generalized constraint (GC) nodes, are known to offer a reduced gap to capacity compared with conventional LDPC codes, while also maintaining linear growth of minimum distance. However, for certain classes of practical GLDPC codes, a gap to capacity remains even when a blockwise decoding algorithm is used at the GC nodes. In this work, we propose to optimize the design of GLDPC codes in which the GC nodes are decoded with a trellis-based bit-wise Bahl-Cocke-Jelinek-Raviv (BCJR) component decoding algorithm. We analyze the asymptotic threshold behavior of GLDPC codes and determine the optimal proportion of GC nodes in the GLDPC Tanner graph. We show significant performance improvements compared to existing designs with the same order of decoding complexity. (A small structural sketch of a GC node follows this list.)
- We consider the problem of coded distributed computing, where a large linear computational job, such as a matrix multiplication, is divided into $k$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $n$ distributed nodes. The goal is to reduce the average execution time of the computational job. We provide a connection between the problem of characterizing the average execution time of a coded distributed computing system and the problem of analyzing the error probability of codes of length $n$ used over erasure channels. Accordingly, we present closed-form expressions for the execution time using binary random linear codes, as well as the best execution time any linear-coded distributed computing system can achieve. It is also shown that there exist good binary linear codes that attain, asymptotically, the best performance any linear code, not necessarily binary, can achieve. We also investigate the performance of coded distributed computing systems using polar and Reed-Muller (RM) codes, which benefit from low-complexity decoding and superior performance, respectively, as well as from explicit constructions. The proposed framework in this paper can enable efficient designs of distributed computing systems given the rich literature in channel coding theory. (A Monte Carlo sketch of the coded versus uncoded execution time appears after this list.)
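For the private information retrieval item above, here is a sketch of the classic linear two-database scheme on replicated one-bit files, included only as the linear baseline that the abstract's non-linear random-binning scheme is compared against; the file count and wanted index are illustrative assumptions.

```python
import secrets

# Classic linear two-database PIR on replicated one-bit files (the linear
# baseline; the abstract's point is that a non-linear random-binning scheme
# can outperform any such linear scheme).
K = 8
files = [secrets.randbelow(2) for _ in range(K)]    # both servers hold all K bits
want = 5                                            # index the user wants, privately

S1 = {j for j in range(K) if secrets.randbelow(2)}  # uniformly random subset
S2 = S1 ^ {want}                                    # symmetric difference with {want}

def answer(query):
    """Each server returns the XOR of the queried bits."""
    a = 0
    for j in query:
        a ^= files[j]
    return a

# Each query alone is a uniform random subset, so neither server learns `want`,
# yet the XOR of the two answers is exactly the wanted bit.
assert answer(S1) ^ answer(S2) == files[want]
```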
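For the service capacity item, the following toy feasibility check models a [3,2] MDS code (node 1 stores file A, node 2 stores file B, node 3 stores A+B) and searches over splits of requests between the systematic node and the repair group; the service rate and arrival rates are assumed values, not results from the paper.

```python
import numpy as np

# Toy feasibility check for the service capacity of a [3,2] MDS code:
# node 1 stores file A, node 2 stores file B, node 3 stores A+B, and each
# node has service rate mu.  A request for A is served either by node 1 or
# by the repair group {2, 3} (which loads both of those nodes); similarly
# for B.  All rates below are assumed values for illustration.
mu = 1.0

def supportable(lam_a, lam_b, grid=201):
    """True if some split of requests keeps every node's load at most mu."""
    for xa in np.linspace(0, 1, grid):      # fraction of A-requests sent to node 1
        for xb in np.linspace(0, 1, grid):  # fraction of B-requests sent to node 2
            load1 = xa * lam_a
            load2 = xb * lam_b + (1 - xa) * lam_a
            load3 = (1 - xa) * lam_a + (1 - xb) * lam_b
            if max(load1, load2, load3) <= mu + 1e-9:
                return True
    return False

print(supportable(1.5, 0.25))   # True: skewed popularity absorbed via the parity node
print(supportable(1.5, 1.5))    # False: outside the service capacity region
```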
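For the GLDPC item, this small structural sketch contrasts a single parity-check (SPC) node with a generalized constraint (GC) node whose neighborhood must form a codeword of a component code; the [7,4] Hamming code is chosen purely for illustration and no decoder is implemented.

```python
import numpy as np

# Structural sketch of a generalized constraint (GC) node: instead of the
# single even-parity check of an SPC node, it requires its neighborhood to
# be a codeword of a component code.  The [7,4] Hamming code below is an
# illustrative choice, not the component code from the work above.
H_hamming = np.array([[1, 0, 1, 0, 1, 0, 1],
                      [0, 1, 1, 0, 0, 1, 1],
                      [0, 0, 0, 1, 1, 1, 1]])

def spc_node_satisfied(bits):
    """An SPC node only requires even parity."""
    return int(bits.sum()) % 2 == 0

def gc_node_satisfied(bits):
    """A GC node is satisfied iff its neighborhood is a component codeword."""
    return not np.any(H_hamming @ bits % 2)

word = np.array([1, 1, 0, 0, 0, 0, 0])   # even weight but not a Hamming codeword
print(spc_node_satisfied(word))          # True  -- the SPC check passes
print(gc_node_satisfied(word))           # False -- the stricter GC check fails
```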
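For the coded distributed computing item, the following Monte Carlo sketch compares the average execution time of an uncoded job (wait for all k tasks) with an MDS-coded job (wait for the fastest k of n coded tasks), under assumed i.i.d. exponential task runtimes; it illustrates the order-statistics effect, not the closed-form expressions of the paper.

```python
import random

# Monte Carlo sketch of the order-statistics effect behind coded distributed
# computing: an uncoded job waits for all k tasks, while an (n, k) MDS-coded
# job finishes once the fastest k of n coded tasks are done.  The i.i.d.
# exponential runtimes and the parameters are assumptions for illustration.
random.seed(0)
k, n, trials = 10, 14, 20_000

def uncoded_time():
    return max(random.expovariate(1.0) for _ in range(k))       # slowest of k

def mds_coded_time():
    times = sorted(random.expovariate(1.0) for _ in range(n))
    return times[k - 1]                                          # k-th fastest of n

print(sum(uncoded_time() for _ in range(trials)) / trials)       # ~2.9
print(sum(mds_coded_time() for _ in range(trials)) / trials)     # ~1.2
```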