Flash caches are used to reduce peak backend load for throughput-constrained data center services, reducing the total number of backend servers required. Bulk storage systems are a large-scale example, backed by high-capacity but low-throughput hard disks, and using flash caches to provide a more cost-effective storage layer underlying everything from blobstores to data warehouses. However, flash caches must address the limited write endurance of flash by limiting the long-term average flash write rate to avoid premature wearout. To do so, most flash caches must use admission policies to filter cache insertions and maximize the workload-reduction value of each flash write. The Baleen flash cache uses coordinated ML admission and prefetching to reduce peak backend load. After learning painful lessons with our early ML policy attempts, we exploit a new cache residency model (which we call episodes) to guide model training. We focus on optimizing for an end-to-end system metric (Disk-head Time) that measures backend load more accurately than IO miss rate or byte miss rate. Evaluation using Meta traces from seven storage clusters shows that Baleen reduces Peak Disk-head Time (and hence the number of backend hard disks required) by 12% over state-of-the-art policies for a fixed flash write rate constraint. Baleen-TCO, which chooses an optimal flash write rate, reduces our estimated total cost of ownership (TCO) by 17%. Code and traces are available at https://www.pdl.cmu.edu/CILES/.
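To make the admission idea concrete, here is a minimal sketch of a write-rate-budgeted admission filter: admit an item only if its predicted benefit per flash byte written clears a threshold, and tighten the threshold when the observed write rate exceeds the budget. This is an illustrative, assumption-laden sketch, not Baleen's trained ML policy; names such as `predicted_dt_saved_ms`, `BudgetedAdmission`, and the feedback rule are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AdmissionCandidate:
    key: str
    size_bytes: int
    predicted_dt_saved_ms: float  # hypothetical: a model's estimate of disk-head
                                  # time saved if this item is admitted to flash

class BudgetedAdmission:
    """Admit only items whose predicted benefit per flash byte written clears a
    threshold; become more selective when the long-term flash write rate exceeds
    the budget. Illustrative only -- not Baleen's actual admission model."""

    def __init__(self, write_budget_bytes_per_s: float, threshold: float = 0.0):
        self.write_budget = write_budget_bytes_per_s
        self.threshold = threshold
        self.bytes_written = 0.0
        self.elapsed_s = 1e-9  # avoid division by zero before any time passes

    def observe_time(self, dt_s: float) -> None:
        self.elapsed_s += dt_s

    def admit(self, cand: AdmissionCandidate) -> bool:
        benefit_per_byte = cand.predicted_dt_saved_ms / max(cand.size_bytes, 1)
        if benefit_per_byte < self.threshold:
            return False
        # Enforce the write-rate constraint: skip admission while over budget.
        if self.bytes_written / self.elapsed_s > self.write_budget:
            self.threshold *= 1.1  # hypothetical feedback: become more selective
            return False
        self.bytes_written += cand.size_bytes
        return True

# Example: a 1 MiB segment predicted to save 4 ms of disk-head time.
policy = BudgetedAdmission(write_budget_bytes_per_s=50e6, threshold=1e-6)
policy.observe_time(1.0)
print(policy.admit(AdmissionCandidate("seg1", 1 << 20, 4.0)))  # True
```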
Finding optimal non-datapath caching strategies via network flow
Flash and non-volatile memory (NVM) devices have only a limited number of write-erase cycles. Consequently, when such devices are employed as caches, cache management policies may choose not to cache certain requested items in order to extend device lifespan. In this work, we propose a simple single-parameter utility function to model the trade-off between maximizing hit rate and minimizing write-erase cycles for such caches, and we study the problem of developing an offline strategy for deciding whether to write a new item to the cache and, if so, which item already in the cache to replace. Our main result is an efficient, network-flow-based algorithm that finds an optimal cache management policy in this new setting.
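For intuition, the sketch below evaluates one plausible form of such a single-parameter utility, hits minus lambda times the number of cache insertions (write-erase cycles), by replaying an offline decision sequence against a request trace. The exact utility definition and the decision encoding are our assumptions for illustration; the paper's network-flow algorithm for computing the optimal policy is not reproduced here.

```python
from typing import List, Optional, Set, Tuple

def utility(hits: int, writes: int, lam: float) -> float:
    """Single-parameter utility: reward hits, penalize cache writes.
    Assumed form for illustration; the paper's exact definition may differ."""
    return hits - lam * writes

def score_offline_policy(trace: List[str],
                         decisions: List[Optional[Tuple[bool, Optional[str]]]],
                         capacity: int,
                         lam: float) -> float:
    """Replay a request trace against a precomputed (offline) decision list.

    decisions[i] applies to the i-th request only if it misses:
      (False, None)          -> bypass the cache (non-datapath: a miss need not insert)
      (True, victim_or_None) -> insert, evicting `victim` if the cache is full.
    """
    cache: Set[str] = set()
    hits = writes = 0
    for item, decision in zip(trace, decisions):
        if item in cache:
            hits += 1
            continue
        if decision is None:
            continue
        insert, victim = decision
        if not insert:
            continue
        if len(cache) >= capacity:
            cache.discard(victim)  # the offline policy names the victim to replace
        cache.add(item)
        writes += 1                # each insertion costs one write-erase cycle
    return utility(hits, writes, lam)

# Example: with lam = 0.5, caching 'a' pays off only because it is re-referenced.
trace = ["a", "b", "a", "a"]
decisions = [(True, None), (False, None), None, None]
print(score_offline_policy(trace, decisions, capacity=1, lam=0.5))  # 2 - 0.5*1 = 1.5
```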
- Award ID(s): 1956229
- PAR ID: 10413704
- Date Published:
- Journal Name: Theoretical Computer Science
- Volume: 945
- ISSN: 0304-3975
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In this paper, we investigate how to soundly analyze multi-level caches that employ a write-back policy at each level for worst-case execution time (WCET) estimation. To the best of our knowledge, there is only one existing approach for dealing with write-backs in multi-level cache analysis; however, as shown in the paper, this existing approach is not sound. To soundly handle write-backs at a cache level, we need to consider whether a memory block is potentially dirty and when such a potentially dirty block may be evicted from the cache. To this end, we introduce a dirty attribute into persistence analysis for tracking dirty blocks, and we over-approximate a write-back window for each possible write-back. Based on these over-approximated write-back occurrence times, we propose an approach that can soundly deal with write-backs in the analysis of multi-level (unified) caches for WCET estimation. Possible write-back costs are also integrated into path analysis. We evaluate the proposed approach on a set of benchmarks to demonstrate its effectiveness.
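As a toy illustration of the soundness point described above (ours, not the paper's actual persistence analysis or write-back window computation): after a join of program paths, a block that may be dirty on either path must stay possibly dirty, and its later eviction must be charged a possible write-back.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AbstractBlock:
    """Abstract state of one memory block at a cache level.
    Both flags are over-approximations ("possibly true")."""
    may_be_cached: bool
    may_be_dirty: bool

def join(a: AbstractBlock, b: AbstractBlock) -> AbstractBlock:
    """Merge states from two incoming program paths. For soundness, the join must
    keep a block possibly dirty if it is dirty on either path."""
    return AbstractBlock(a.may_be_cached or b.may_be_cached,
                         a.may_be_dirty or b.may_be_dirty)

def store(_: AbstractBlock) -> AbstractBlock:
    """A store under write-back marks the block cached and dirty; nothing is
    written to the next level yet."""
    return AbstractBlock(True, True)

def eviction_cost(block: AbstractBlock, writeback_penalty_cycles: int) -> int:
    """A sound WCET bound must charge a possible write-back when a possibly
    dirty block may be evicted."""
    return writeback_penalty_cycles if (block.may_be_cached and block.may_be_dirty) else 0

# The block is written on one path only; after the join it must still be treated
# as possibly dirty, so its later eviction is charged a possible write-back.
path1 = store(AbstractBlock(True, False))
path2 = AbstractBlock(True, False)
merged = join(path1, path2)
print(eviction_cost(merged, writeback_penalty_cycles=100))  # 100
```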
-
We present an experimental thermal memory with direct optical control and readout. Information is stored in the internal temperature of the device, while laser illumination is used to read, write, and erase stored bits. Our design is based on an absorptive optical resonance in a silicon photonic crystal slab. When the slab is illuminated by a laser with a wavelength close to the resonance, the optical absorption is nonlinear with power, resulting in thermo-optic bistability. We experimentally demonstrate bistability in a fabricated device and show the reading, writing, and erasing of a single memory bit. A hybrid optothermal model shows good agreement with the experiment. Time-dependent measurements show that the experimental write/erase times are less than 500 µs. We demonstrate that memory reliability is maintained over 10⁶ cycles, with less than 3% change in the transmission values for the memory ON and OFF states. Our approach allows operation in high-temperature and/or highly fluctuating temperature environments up to 100 °C or greater.
-
Stateful serverless workflows consist of multiple serverless functions that access state on a remote database. Developers sometimes add a cache layer between the serverless runtime and the database to improve I/O latency. However, in a serverless environment, functions in the same workflow may be scheduled to different nodes with different caches, which can cause non-intuitive anomalies. This paper presents CausalMesh, a novel approach to causally consistent caching in serverless computing. CausalMesh is the first cache system that supports coordination-free and abort-free read/write operations and read transactions when clients roam among multiple servers. CausalMesh also supports read-write transactional causal consistency in the presence of client roaming, but at the cost of abort-freedom. Our evaluation shows that CausalMesh has lower latency and higher throughput than existing proposals. (See the illustrative sketch after this entry.)
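As a companion to the CausalMesh abstract above, the following sketch illustrates the roaming-client hazard it targets: a client carries minimum acceptable versions for the keys it depends on, and a cache node serves a read only if its copy is at least that new, otherwise falling through to the database. This is a simplified, per-key session-style check introduced purely for illustration; it is not CausalMesh's coordination-free protocol, and all class and method names here are hypothetical.

```python
from typing import Dict, Tuple

class Database:
    """Authoritative store; each key's version increases monotonically."""
    def __init__(self) -> None:
        self.data: Dict[str, Tuple[int, str]] = {}   # key -> (version, value)

    def write(self, key: str, value: str) -> int:
        version = self.data.get(key, (0, ""))[0] + 1
        self.data[key] = (version, value)
        return version

    def read(self, key: str) -> Tuple[int, str]:
        return self.data.get(key, (0, ""))

class CacheNode:
    """A per-node cache that may lag behind the database."""
    def __init__(self, db: Database) -> None:
        self.db = db
        self.entries: Dict[str, Tuple[int, str]] = {}

    def read(self, key: str, min_version: int) -> Tuple[int, str]:
        version, value = self.entries.get(key, (-1, ""))
        if version >= min_version:
            return version, value              # cached copy satisfies the dependency
        version, value = self.db.read(key)     # stale or absent: fall through and refresh
        self.entries[key] = (version, value)
        return version, value

class RoamingClient:
    """Carries its dependencies so hopping to another cache node cannot go back in time."""
    def __init__(self, db: Database) -> None:
        self.db = db
        self.deps: Dict[str, int] = {}         # key -> minimum acceptable version

    def write(self, key: str, value: str) -> None:
        # Writes go straight to the database here (write-through, for simplicity).
        self.deps[key] = self.db.write(key, value)

    def read(self, node: CacheNode, key: str) -> str:
        version, value = node.read(key, self.deps.get(key, 0))
        self.deps[key] = max(self.deps.get(key, 0), version)
        return value

# A later workflow step lands on a node holding a stale copy, but the dependency
# check forces a refresh instead of returning the old value.
db = Database()
node_b = CacheNode(db)
client = RoamingClient(db)
db.write("order:42", "pending")          # some earlier writer
node_b.read("order:42", 0)               # node_b caches the old value
client.write("order:42", "paid")         # this client's write, on a different node
print(client.read(node_b, "order:42"))   # -> "paid", not the stale "pending"
```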