Title: Asymmetric LOCO Codes: Constrained Codes for Flash Memories
In data storage and data transmission, certain patterns are more likely to be subject to error when written (transmitted) onto the media. In magnetic recording systems with binary data and bipolar non-return-to-zero signaling, patterns that have insufficient separation between consecutive transitions exacerbate inter-symbol interference. Constrained codes are used to eliminate such error-prone patterns. A recent example is a new family of capacity-achieving constrained codes, named lexicographically-ordered constrained codes (LOCO codes). LOCO codes are symmetric, that is, the set of forbidden patterns is closed under taking pattern complements. LOCO codes are suboptimal in terms of rate when used in Flash devices where block erasure is employed, since the complement of an error-prone pattern is not detrimental in these devices. This paper introduces asymmetric LOCO codes (A-LOCO codes), which are lexicographically-ordered constrained codes that forbid only those patterns that are detrimental for Flash performance. A-LOCO codes are also capacity-achieving, and at finite lengths, they offer higher rates than the available state-of-the-art constrained codes designed for the same goal. The mapping-demapping between the index and the codeword in A-LOCO codes allows low-complexity encoding and decoding algorithms that are simpler than their LOCO counterparts.
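The index-to-codeword mapping described in the abstract can be illustrated with a brute-force sketch. This is not the paper's low-complexity encoding-decoding rule: it simply enumerates all allowed words, relies on their lexicographic order, and uses list position as the index. The forbidden pattern `"101"`, the symmetric comparison set `("101", "010")`, and all function names are assumptions chosen for illustration, and the sketch ignores forbidden patterns that could arise when codewords are concatenated.

```python
from bisect import bisect_left
from itertools import product
from math import floor, log2

# Assumption for illustration only: a single forbidden pattern.
FORBIDDEN = ("101",)


def codebook(m, forbidden=FORBIDDEN):
    """Return all length-m binary words that avoid every forbidden pattern.

    itertools.product over "01" already yields words in lexicographic order,
    so the returned list is sorted.
    """
    words = ("".join(bits) for bits in product("01", repeat=m))
    return [w for w in words if not any(p in w for p in forbidden)]


def encode(index, book):
    """Map an index to the codeword at that lexicographic position."""
    return book[index]


def decode(word, book):
    """Map a codeword back to its lexicographic index via binary search."""
    i = bisect_left(book, word)
    if i == len(book) or book[i] != word:
        raise ValueError(f"{word!r} is not a valid codeword")
    return i


def rate(m, forbidden=FORBIDDEN):
    """Information bits per coded bit when indices are fixed-length binary."""
    return floor(log2(len(codebook(m, forbidden)))) / m


if __name__ == "__main__":
    asymmetric = codebook(4)                  # forbids only "101"
    symmetric = codebook(4, ("101", "010"))   # also forbids the complement
    print(len(asymmetric), len(symmetric))
    print(encode(5, asymmetric), decode(encode(5, asymmetric), asymmetric))
    print(rate(4))
```

In this toy setting, dropping the complement pattern from the forbidden set leaves more codewords at the same length, which is the intuition behind the rate advantage the abstract claims for asymmetric constraints.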
Award ID(s):
1717602
NSF-PAR ID:
10191392
Journal Name:
57th Annual Allerton Conference on Communications, Control and Computing
Page Range / eLocation ID:
124 to 131
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Flash memory devices are winning the competition for storage density against magnetic recording devices. This outcome results from advances in physics that allow storage of more than one bit per cell, coupled with advances in signal processing that reduce the effect of physical instabilities. Constrained codes are used in storage to avoid problematic patterns. Recently, we introduced binary symmetric lexicographically-ordered constrained codes (LOCO codes) for data storage and transmission. This paper introduces simple constrained codes that support non-binary physical gates in multi, triple, quad, and the currently-in-development penta-level cell (M/T/Q/P-LC) Flash memories. The new codes can be easily modified if problematic patterns change with time. These codes are designed to mitigate inter-cell interference, which is a critical source of error in Flash devices. The new codes are called q-ary asymmetric LOCO codes (QA-LOCO codes), and the construction subsumes codes previously designed for single-level cell (SLC) Flash devices (A-LOCO codes). QA-LOCO codes work for a Flash device with any number, q, of levels per cell. For q ≥ 4, we show that QA-LOCO codes can achieve rates greater than $0.95 \log_2 q$ information bits per coded symbol. Capacity-achieving rates, affordable encoding-decoding complexity, and ease of reconfigurability support the growing improvement of M/T/Q/P-LC Flash memory devices, as well as lifecycle management as the characteristics of these devices change with time.
  2. To meet the demands of data-hungry applications, data storage devices are required to be increasingly dense. Various sources of error appear with this increase in density. Multi-dimensional (MD) graph-based codes are capable of mitigating error sources like interference and channel non-uniformity in dense storage devices. Recently, a technique was proposed to enhance the performance of MD spatially-coupled codes that are based on circulants. The technique carefully relocates circulants to minimize the number of short cycles. However, cycles become more detrimental when they combine to form more advanced objects, e.g., absorbing sets, including low-weight codewords. In this paper, we show how MD relocations can be exploited to minimize the number of detrimental objects in the graph of an MD code. Moreover, we demonstrate the savings in the number of relocation arrangements earned by focusing on objects rather than cycles. Our technique is applicable to a wide variety of one-dimensional (OD) codes. Simulation results reveal significant lifetime gains in practical Flash systems achieved by MD codes designed using our technique compared with OD codes having similar parameters.
  3. In this paper, we consider the equivocation of finite blocklength coset codes when used over binary erasure wiretap channels. We make use of the equivocation matrix in comparing codes that are suitable for scenarios with noisy channels for both the intended receiver and an eavesdropper. Equivocation matrices have been studied in the past only for the binary erasure wiretap channel model with a noiseless channel for the intended recipient. In that case, an exact relationship between the elements of equivocation matrices for a code and its dual code was identified. The majority of work on coset codes for wiretap channels only addresses the noise-free main channel case, and extensions to noisy main channels require multi-edge type codes. In this paper, we supply a more insightful proof for the noiseless main channel case, and identify a new dual relationship that applies when two-edge type coset codes are used for the noisy main channel case. The end result is that the elements of the equivocation matrix for a dual code are known precisely from the equivocation matrix of the original code according to fixed reordering patterns. Such relationships allow one to study the equivocation of codes and their duals in tandem, which simplifies the search for best and/or good finite blocklength codes. This paper is the first work that succinctly links the equivocation/error correction capabilities of dual codes for two-edge type coset coding over erasure-prone main channels. 
  4. As the number, size, and complexity of building construction projects increase, code compliance checking becomes more challenging because manual checking is time-consuming, costly, and error-prone. Fully automated code compliance checking would make code checking more efficient, more cost-effective, and less susceptible to human error. Such automation requires automated information extraction from building designs and building codes, and automated information transformation to a format that allows automated reasoning. Natural Language Processing (NLP) is an important technology to support such automated processing of building codes, because building codes are written in natural language. Part-of-speech (POS) tagging, as an important basis of NLP tasks, must perform well to ensure the quality of the automated processing of building codes in such a compliance checking system. However, no systematic testing of existing POS taggers on domain-specific building code data has been performed. To address this gap, the authors analyzed the performance of seven state-of-the-art POS taggers on tagging building codes and compared their results to a manually labeled gold standard. The authors aim to: (1) find the best-performing tagger in terms of accuracy, and (2) identify common sources of errors. In providing the POS tags, the authors used the Penn Treebank tagset, a widely used tagset with a proper balance between conciseness and information richness. An average accuracy of 88.80% was found on the testing data. The Stanford CoreNLP tagger outperformed the other taggers in the experiment. Common sources of errors were identified to be: (1) word ambiguity, (2) rare words, and (3) unique meanings of common English words in the construction context. These results call for performance improvements, such as error-fixing transformational rules and machine taggers trained on building codes.
  5. In general, the generator matrix sparsity is a critical factor in determining the encoding complexity of a linear code. Further, certain applications, e.g., distributed crowdsourcing schemes utilizing linear codes, require most or even all the columns of the generator matrix to have some degree of sparsity. In this paper, we leverage polar codes and the well-established channel polarization to design capacity-achieving codes with a certain constraint on the weights of all the columns in the generator matrix (GM) while having a low-complexity decoding algorithm. We first show that given a binary-input memoryless symmetric (BMS) channel $W$ and a constant $s \in (0, 1]$, there exists a polarization kernel such that the corresponding polar code is capacity-achieving with the rate of polarization $s/2$, and the GM column weights being bounded from above by $N^{s}$. To improve the sparsity versus error rate trade-off, we devise a column-splitting algorithm and two coding schemes for BEC and then for general BMS channels. The polar-based codes generated by the two schemes inherit several fundamental properties of polar codes with the original $2 \times 2$ kernel including the decay in error probability, decoding complexity, and the capacity-achieving property. Furthermore, they demonstrate the additional property that their GM column weights are bounded from above sublinearly in $N$, while the original polar codes have some column weights that are linear in $N$. In particular, for any BEC and $\beta < 0.5$, the existence of a sequence of capacity-achieving polar-based codes where all the GM column weights are bounded from above by $N^{\lambda}$ with $\lambda \approx 0.585$, and with the error probability bounded by ${\mathcal {O}}(2^{-N^{\beta }})$ under a decoder with complexity ${\mathcal {O}}(N\log N)$, is shown. The existence of similar capacity-achieving polar-based codes with the same decoding complexity is shown for any BMS channel and $\beta < 0.5$ with $\lambda \approx 0.631$.