Flash and non-volatile memory (NVM) devices support only a limited number of write-erase cycles. Consequently, when these devices are employed as caches, cache management policies may choose not to cache certain requested items in order to extend device lifespan. In this work, we propose a simple single-parameter utility function to model the trade-off between maximizing hit rate and minimizing write-erase cycles for such caches, and we study the problem of developing an off-line strategy for deciding whether to write a new item to the cache and, if so, which item already in the cache to replace. Our main result is an efficient, network-flow-based algorithm that finds the optimal cache management policy under this new setting.
Baleen: ML Admission & Prefetching for Flash Caches
Flash caches are used to reduce peak backend load for throughput-constrained data center services, reducing the total number of backend servers required. Bulk storage systems are a large-scale example, backed by high-capacity but low-throughput hard disks, and using flash caches to provide a more cost-effective storage layer underlying everything from blobstores to data warehouses. However, flash caches must address the limited write endurance of flash by limiting the long-term average flash write rate to avoid premature wearout. To do so, most flash caches must use admission policies to filter cache insertions and maximize the workload-reduction value of each flash write. The Baleen flash cache uses coordinated ML admission and prefetching to reduce peak backend load. After learning painful lessons with our early ML policy attempts, we exploit a new cache residency model (which we call episodes) to guide model training. We focus on optimizing for an end-to-end system metric (Disk-head Time) that measures backend load more accurately than IO miss rate or byte miss rate. Evaluation using Meta traces from seven storage clusters shows that Baleen reduces Peak Disk-head Time (and hence the number of backend hard disks required) by 12% over state-of-the-art policies for a fixed flash write rate constraint. Baleen-TCO, which chooses an optimal flash write rate, reduces our estimated total cost of ownership (TCO) by 17%. Code and traces are available at https://www.pdl.cmu.edu/CILES/.
- Award ID(s): 2150217
- PAR ID: 10574914
- Publisher / Repository: Usenix
- Date Published:
- ISBN: 978-1-939133-38-0
- Format(s): Medium: X
- Location: Santa Clara, CA
- Sponsoring Org: National Science Foundation
More Like this
- Modern solid-state disks achieve high data transfer rates due to their massive internal parallelism. However, out-of-place updates for flash memory incur garbage collection costs when valid data needs to be copied during space reclamation. The root cause of this extra cost is that solid-state disks are not always able to accurately determine data lifetime and group together data that expires before the space needs to be reclaimed. Real-time systems found in autonomous vehicles, industrial control systems, and assembly-line robots store data from hundreds of sensors and often have predictable data lifetimes. These systems require guaranteed high storage bandwidth for read and write operations by mission-critical real-time tasks. In this article, we depart from the traditional block device interface to guarantee the high throughput needed to process large volumes of data. Using data lifetime information from the application layer, our proposed real-time design, called Telomere, is able to intelligently lay out data in NAND flash memory and eliminate valid page copies during garbage collection. Telomere’s real-time admission control is able to guarantee tasks their required read and write operations within their periods. Under randomly generated tasksets containing 500 tasks, Telomere achieves 30% higher throughput with a 5% storage cost compared to pre-existing techniques.
- With the increasing dominance of SSDs for local storage, today's network-mounted virtual disks can no longer offer competitive performance. We propose a Log-Structured Virtual Disk (LSVD) that couples log-structured approaches at both the cache and storage layers to provide a virtual disk on top of S3-like storage. Both the cache and the backend store are order-preserving, enabling LSVD to provide strong consistency guarantees in case of failure. Our prototype demonstrates that the approach preserves all the advantages of virtual disks, while offering dramatic performance improvements over not only commonly used virtual disks, but also the same disks combined with inconsistent (i.e., unsafe) local caching.
- We propose a decentralized spatial soft-core cache placement (SSCC) policy for wireless networks. SSCC yields a spatially balanced sampling via negative dependence across caches, and can be tuned to satisfy cache size constraints with high probability. Given a desired cache hit probability, we compare the 95% confidence intervals of the required cache sizes for the independent placement, hard-core placement, and SSCC policies. We demonstrate that in terms of the required cache storage size, SSCC can provide up to more than 180% and 100% gains with respect to the independent and hard-core placement policies, respectively. SSCC can be used to enable proximity-based applications such as device-to-device communications and peer-to-peer networking, as it promotes item diversity and reciprocation among the nodes.

