Title: Reprogramming 3D TLC Flash Memory based Solid State Drives
NAND flash memory-based SSDs have been widely adopted. SSD scaling has evolved from planar (2D) to 3D stacking. For reliability and other reasons, the technology node in 3D NAND SSDs is larger than in 2D, but data density can be increased by increasing the number of bits per cell. In this work, we develop a novel reprogramming scheme for TLCs in 3D NAND SSDs, such that a cell can be programmed and reprogrammed several times before it is erased. Such reprogramming can improve the endurance of a cell and the speed of programming, and increase the number of bits written to a cell per program/erase cycle, i.e., the effective capacity. Our work is the first to perform a real 3D NAND SSD test to validate the feasibility of the reprogram operation. From the collected data, we derive the restrictions on performing reprogramming that arise from reliability challenges. Furthermore, a reprogrammable SSD (ReSSD) is designed to structure reprogram operations. ReSSD is evaluated in a case study in a RAID 5 system (RSS-RAID). Experimental results show that RSS-RAID can improve endurance by 35.7%, boost write performance by 15.9%, and increase effective capacity by 7.71%, with negligible overhead compared with a conventional 3D SSD-based RAID 5 system.
Award ID(s):
2011146 1910413 1725657 1718080
PAR ID:
10316354
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM Transactions on Storage
Volume:
18
Issue:
1
ISSN:
1553-3077
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Traditional RAID solutions (e.g., Linux MD) balance writes evenly across the array for high I/O parallelism and data reliability. This is built around the assumption that the underlying storage components are homogeneous, both in performance and capacity. However, SSDs, even for the same model, exhibit very different characteristics and degrade over time, leading to severe disk under-utilization. In this work, we present Asymmetric-RAID (Asym-RAID), a novel RAID architecture that optimizes system performance and storage utilization by exploiting heterogeneity from a larger SSD pool. Asym-RAID asymmetrically distributes data across the array to fully utilize the capacity of each SSD. To improve performance, Asym-RAID differentially exports the address space of each data stripe to the host, allowing for performance-optimized data placement. We outline the necessary changes in the storage stack for building an asymmetric RAID system and highlight its benefits. 
  2. This paper studies how RAID (redundant array of independent disks) could take full advantage of modern SSDs (solid-state drives) with built-in transparent compression. In current practice, RAID users are forced to choose a specific RAID level (e.g., RAID 10 or RAID 5) with a fixed storage cost vs. speed performance trade-off. The commercial market is witnessing the emergence of a new family of SSDs that can internally perform hardware-based lossless compression on each 4KB LBA (logical block address) block, transparent to host OS and user applications. Beyond straightforwardly reducing the RAID storage cost, such modern SSDs make it possible to relieve RAID users from being locked into a fixed storage cost vs. speed performance trade-off. In particular, RAID systems could opportunistically leverage higher-than-expected runtime user data compressibility to enable dynamic RAID level conversion to improve the speed performance without compromising the effective storage capacity. This paper presents techniques to enable and optimize the practical implementation of such elastic RAID systems. We implemented a Linux software-based elastic RAID prototype that supports dynamic conversion between RAID 5 and RAID 10. Compared with a baseline software-based RAID 5, under sufficient runtime data compressibility that enables the conversion from RAID 5 to RAID 10 over 60% of user data, the elastic RAID could improve the 4KB random write IOPS (I/O per second) by 42% and 4KB random read IOPS in degraded mode by 46%, while maintaining the same effective storage capacity. 
  3. NAND flash-based Solid State Devices (SSDs) offer the desirable features of high performance, energy efficiency, and fast-growing capacity. Thus, the use of SSDs is increasing in distributed storage systems. A key obstacle in this context is that the natural imbalance in distributed I/O workloads can result in wear imbalance across the SSDs in a distributed setting. This, in turn, can have a significant impact on the reliability, performance, and lifetime of the storage deployment. Extant load balancers for storage systems do not consider SSD wear imbalance when placing data, as the main design goal of such balancers is to extract higher performance. Consequently, data migration is the only common technique for tackling wear imbalance, where existing data is moved from highly loaded servers to the least loaded ones. In this paper, we explore an innovative holistic approach, Chameleon, that employs data redundancy techniques such as replication and erasure coding, coupled with endurance-aware write offloading, to mitigate wear-level imbalance in distributed SSD-based storage. Chameleon aims to balance the wear among different flash servers while meeting the objectives of extending the life of flash servers, improving I/O performance, and avoiding bottlenecks. Evaluation with a 50-node SSD cluster shows that Chameleon reduces the wear distribution deviation by 81% while improving write performance by up to 33%.
  4. In the US alone, data centers consumed around $20 billion (200 TWh) of electricity in 2016, and this amount doubles every five years. Data storage alone is estimated to be responsible for about 25% to 35% of data-center power consumption. Servers in data centers generally include multiple HDDs or SSDs, commonly arranged in a RAID level for better performance, reliability, and availability. In this study, we evaluate the impact of HDD- and SSD-based Linux (md) software RAIDs on the energy consumption of popular servers. We used the Filebench workload generator to emulate three common server workloads (web, file, and mail) and measured the energy consumption of the system using the HOBO power meter. We observed some similarities and some differences in the energy consumption characteristics of HDD and SSD RAIDs, and provide our insights for better energy efficiency. We hope that our observations will shed light on new energy-efficient RAID designs tailored to the specific energy consumption characteristics of HDD and SSD RAIDs.
  5. 2D memristors have recently demonstrated attractive resistive switching characteristics but also suffer from reliability issues, which limit practical applications. Previous efforts on 2D memristors have primarily focused on exploring new material systems, while damage from the metallization step remains a practical concern for the reliability of 2D memristors. Here, the impact of metallization conditions and the thickness of MoS2 films on the reliability and other device metrics of MoS2‐based memristors is carefully studied. The statistical electrical measurements show that the reliability can be improved to 92% for yield and improved by ≈16× for average DC cycling endurance in the devices by reducing the top electrode (TE) deposition rate and increasing the thickness of MoS2 films. An intriguing convergence of switching voltages and resistance ratio is revealed by the statistical analysis of experimental switching cycles. An “effective switching layer” model compatible with both monolayer and few‐layer MoS2 is proposed to understand the reliability improvement related to the optimization of fabrication configuration and the convergence of switching metrics. The Monte Carlo simulations help illustrate the underlying physics of endurance failure associated with cluster formation and provide additional insight into endurance improvement with device fabrication optimization.