skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Take Out the TraChe: Maximizing (Tra)nsactional Ca(che) Hit Rate
Most caching policies focus on increasing object hit rate to improve overall system performance. However, these algorithms are insufficient for transactions. In this work, we define a new metric, transactional hit rate, to capture when caching reduces latency for transactions. We present DeToX, a caching system that leverages transactional dependencies to make eviction and prefetching decisions. DeToX is able to significantly outperform single-object alternatives on real-world workloads and popular OLTP benchmarks, providing up to a 130% increase in transaction hit rate and 3.4x improvement in cache efficiency.  more » « less
Award ID(s):
1730628
PAR ID:
10523931
Author(s) / Creator(s):
; ; ; ; ; ; ;
Publisher / Repository:
USENIX Association
Date Published:
ISBN:
978-1-939133-34-2
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Traditional caching models emphasize hit rate as the principal measure of performance for cache replacement algorithms. However, hit rate alone can be misleading in the presence of a phenomenon known as a delayed hit. Delayed hits occur in high-throughput systems when multiple requests for an object accumulate before the object can be fetched from the backing store. Prior work by Atre et al. has explored the impact of delayed hits in simple caching scenarios, namely single-tier caches with uniform object sizes. In this work we seek to extend that investigation to consider multi-level caches, such as those that might be found in a modern CDN. Furthermore, we extend MAD, the delayed-hits-aware policy proposed by Atre et al, so that it can be deployed in a multi-tier caching system. We evaluate the performance of MAD using a multi-tier cache simulator and an empirical cache configuration based on modern CDNs. Our initial results lead us to believe that delayed hits can still be a prominent factor in the performance of multi-level caches, although their effect may be reduced in comparison to simpler cache configurations. 
    more » « less
  2. Transactional memory has been receiving much attention from both academia and industry. In transactional memory, program code is split into transactions, blocks of code that appear to execute atomically. Transactions are executed speculatively and the speculative execution is supported through data versioning mechanism. Lazy versioning makes aborts fast but penalizes commits, whereas eager versioning makes commits fast but penalizes aborts. However, whether to use eager or lazy versioning to execute those transactions is still a hotly debated topic. Lazy versioning seems appropriate for write-dominated workloads and transactions in high contention scenarios whereas eager versioning seems appropriate for read-dominated workloads and transactions in low contention scenarios. This necessitates a priori knowledge on the workload and contention scenario to select an appropriate versioning method to achieve better performance. In this article, we present an adaptive versioning approach, called Adaptive, that dynamically switches between eager and lazy versioning at runtime, without the need of a priori knowledge on the workload and contention scenario but based on appropriate system parameters, so that the performance of a transactional memory system is always better than that is obtained using either eager or lazy versioning individually. We provide Adaptive for both persistent and non-persistent transactional memory systems using performance parameters appropriate for those systems. We implemented our adaptive versioning approach in the latest software transactional memory distribution TinySTM and extensively evaluated it through 5 micro-benchmarks and 8 complex benchmarks from STAMP and STAMPEDE suites. The results show significant benefits of our approach. Specifically, in persistent TM systems, our approach achieved performance improvements as much as 1.5× for execution time and as much as 240× for number of aborts, whereas our approach achieved performance improvements as much as 6.3× for execution time and as much as 170× for number of aborts in non-persistent transactional memory systems. 
    more » « less
  3. null (Ed.)
    Read-only transactions are critical for consistently reading data spread across a distributed storage system but have worse performance than simple, non-transactional reads. We identify three properties of simple reads that are necessary for read-only transactions to be performance-optimal, i.e.,come as close as possible to simple reads. We demonstrate a fundamental tradeoff in the design of read-only transactions by proving that performance optimality is impossible to achieve with strict serializability, the strongest consistency.Guided by this result, we present PORT, a performance-optimal design with the strongest consistency to date. Central to PORT are version clocks, a specialized logical clock that concisely captures the necessary ordering constraints.We show the generality of PORT with two applications.Scylla-PORT provides process-ordered serializability with simple writes and shows performance comparable to its non-transactional base system. Eiger-PORT provides causal consistency with write transactions and significantly improves the performance of its transactional base system. 
    more » « less
  4. Content delivery networks (CDNs) distribute much of the Internet content by caching and serving the objects requested by users. A major goal of a CDN is to maximize the hit rates of its caches, thereby enabling faster content downloads to the users. Content caching involves two components: an admission algorithm to decide whether to cache an object and an eviction algorithm to decide which object to evict from the cache when it is full. In this paper, we focus on cache admission and propose a novel algorithm called RL-Cache that uses model-free reinforcement learning (RL) to decide whether or not to admit a requested object into the CDN's cache. Unlike prior approaches that use a small set of criteria for decision making, RL-Cache weights a large set of features that include the object size, recency, and frequency of access. We develop a publicly available implementation of RL-Cache and perform an evaluation using production traces for the image, video, and web traffic classes from Akamai's CDN. The evaluation shows that RL-Cache improves the hit rate in comparison with the state of the art and imposes only a modest resource overhead on the CDN servers. Further, RL-Cache is robust enough that it can be trained in one location and executed on request traces of the same or different traffic classes in other locations of the same geographic region. 
    more » « less
  5. Caches are at the heart of latency-sensitive systems. In this paper, we identify a growing challenge for the design of latency-minimizing caches called delayed hits. Delayed hits occur at high throughput, when multiple requests to the same object queue up before an outstanding cache miss is resolved. This effect increases latencies beyond the predictions of traditional caching models and simulations; in fact, caching algorithms are designed as if delayed hits simply didn't exist. We show that traditional caching strategies -- even so called 'optimal' algorithms -- can fail to minimize latency in the presence of delayed hits. We design a new, latency-optimal offline caching algorithm called belatedly which reduces average latencies by up to 45% compared to the traditional, hit-rate optimal Belady's algorithm. Using belatedly as our guide, we show that incorporating an object's 'aggregate delay' into online caching heuristics can improve latencies for practical caching systems by up to 40%. We implement a prototype, Minimum-AggregateDelay (mad), within a CDN caching node. Using a CDN production trace and backends deployed in different geographic locations, we show that mad can reduce latencies by 12-18% depending on the backend RTTs. 
    more » « less