skip to main content


Title: RL-Cache: Learning-Based Cache Admission for Content Delivery
Content delivery networks (CDNs) distribute much of the Internet content by caching and serving the objects requested by users. A major goal of a CDN is to maximize the hit rates of its caches, thereby enabling faster content downloads to the users. Content caching involves two components: an admission algorithm to decide whether to cache an object and an eviction algorithm to decide which object to evict from the cache when it is full. In this paper, we focus on cache admission and propose a novel algorithm called RL-Cache that uses model-free reinforcement learning (RL) to decide whether or not to admit a requested object into the CDN's cache. Unlike prior approaches that use a small set of criteria for decision making, RL-Cache weights a large set of features that include the object size, recency, and frequency of access. We develop a publicly available implementation of RL-Cache and perform an evaluation using production traces for the image, video, and web traffic classes from Akamai's CDN. The evaluation shows that RL-Cache improves the hit rate in comparison with the state of the art and imposes only a modest resource overhead on the CDN servers. Further, RL-Cache is robust enough that it can be trained in one location and executed on request traces of the same or different traffic classes in other locations of the same geographic region.  more » « less
Award ID(s):
1763617 1717179
NSF-PAR ID:
10173204
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
NetAI'19: Proceedings of the 2019 Workshop on Network Meets AI & ML
Page Range / eLocation ID:
57 to 63
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Content delivery networks (CDNs) distribute much of the Internet content by caching and serving the objects requested by users. A major goal of a CDN is to maximize the hit rates of its caches, thereby enabling faster content downloads to the users. Content caching involves two components: an admission algorithm to decide whether to cache an object and an eviction algorithm to determine which object to evict from the cache when it is full. In this paper, we focus on cache admission and propose a novel algorithm called RL-Cache that uses model-free reinforcement learning (RL) to decide whether or not to admit a requested object into the CDN’s cache. Unlike prior approaches that use a small set of criteria for decision making, RL-Cache weights a large set of features that include the object size, recency, and frequency of access. We develop a publicly available implementation of RL-Cache and perform an evaluation using production traces for the image, video, and web traffic classes from Akamai’s CDN. The evaluation shows that RL-Cache improves the hit rate in comparison with the state of the art and imposes only a modest resource overhead on the CDN servers. Further, RL-Cache is robust enough that it can be trained in one location and executed on request traces of the same or different traffic classes in other locations of the same geographic region. The paper also reports extensive analyses of the RL-Cache sensitivity to its features and hyperparameter values. The analyses validate the made design choices and reveal interesting insights into the RL-Cache behavior. 
    more » « less
  2. Content delivery networks (CDNs) distribute much of today's Internet traffic by caching and serving users' contents requested. A major goal of a CDN is to improve hit probabilities of its caches, thereby reducing WAN traffic and user-perceived latency. In this paper, we develop a new approach for caching in CDNs that learns from optimal caching for decision making. To attain this goal, we first propose HRO to compute the upper bound on optimal caching in an online manner, and then leverage HRO to inform future content admission and eviction. We call this new cache design LHR. We show that LHR is efficient since it includes a detection mechanism for model update, an auto-tuned threshold-based model for content admission with a simple eviction rule. We have implemented an LHR simulator as well as a prototype within an Apache Traffic Server and the Caffeine, respectively. Our experimental results using four production CDN traces show that LHR consistently outperforms state of the arts with an increase in hit probability of up to 9% and a reduction in WAN traffic of up to 15% compared to a typical production CDN cache. Our evaluation of the LHR prototype shows that it only imposes a moderate overhead and can be deployed on today's CDN servers. 
    more » « less
  3. Traces from production caching systems of users accessing con- tent are seldom made available to the public as they are considered private and proprietary. The dearth of realistic trace data makes it difficult for system designers and researchers to test and validate new caching algorithms and architectures. To address this key problem, we present TRAGEN, a tool that can generate a synthetic trace that is “similar” to an original trace from the production system in the sense that the two traces would result in similar hit rates in a cache simulation. We validate TRAGEN by first proving that the synthetic trace is similar to the original trace for caches of arbitrary size when the Least-Recently-Used (LRU) policy is used. Next, we empirically validate the similarity of the synthetic trace and original trace for caches that use a broad set of commonly-used caching policies that include LRU, SLRU, FIFO, RANDOM, MARKERS, CLOCK and PLRU. For our empirical validation, we use original request traces drawn from four different traffic classes from the world’s largest CDN, each trace consisting of hundreds of millions of requests for tens of millions of objects. TRAGEN is publicly available and can be used to generate synthetic traces that are similar to actual pro- duction traces for a number of traffic classes such as videos, social media, web, and software downloads. Since the synthetic traces are similar to the original production ones, cache simulations performed using the synthetic traces will yield similar results to what might be attained in a production setting, making TRAGEN a key tool for cache system developers and researchers. 
    more » « less
  4. Content delivery networks (CDNs) cache and serve a majority of the user-requested content on the Internet. Designing caching algorithms that automatically adapt to the heterogeneity, burstiness, and non-stationary nature of real-world content requests is a major challenge and is the focus of our work. While there is much work on caching algorithms for stationary request traffic, the work on non-stationary request traffic is very limited. Consequently, most prior models are inaccurate for non-stationary production CDN traffic. We propose two TTL-based caching algorithms that provide provable performance guarantees for request traffic that is bursty and nonstationary. The first algorithm called d-TTL dynamically adapts a TTL parameter using stochastic approximation. Given a feasible target hit rate, we show that d-TTL converges to its target value for a general class of bursty traffic that allows Markov dependence over time and non-stationary arrivals. The second algorithm called f-TTL uses two caches, each with its own TTL. The first-level cache adaptively filters out non-stationary traffic, while the second-level cache stores frequently-accessed stationary traffic. Given feasible targets for both the hit rate and the expected cache size, f-TTL asymptotically achieves both targets. We evaluate both d-TTL and f-TTL using an extensive trace containing more than 500 million requests from a production CDN server. We show that both d-TTL and f-TTL converge to their hit rate targets with an error of about 1.3%. But, f-TTL requires a significantly smaller cache size than d-TTL to achieve the same hit rate, since it effectively filters out non-stationary content. 
    more » « less
  5. Photo service providers are facing critical challenges of dealing with the huge amount of photo storage, typically in a magnitude of billions of photos, while ensuring national-wide or world-wide satisfactory user experiences. Distributed photo caching architecture is widely deployed to meet high performance expectations, where efficient still mysterious caching policies play essential roles. In this work, we present a comprehensive study on internet-scale photo caching algorithms in the case of QQPhoto from Tencent Inc., the largest social network service company in China. We unveil that even advanced cache algorithms can only perform at a similar level as simple baseline algorithms and there still exists a large performance gap between these cache algorithms and the theoretically optimal algorithm due to the complicated access behaviors in such a large multi-tenant environment. We then expound the reasons behind this phenomenon via extensively investigating the characteristics of QQPhoto workloads. Finally, in order to realistically further improve QQPhoto cache efficiency, we propose to incorporate a prefetcher in the cache stack based on the observed immediacy feature that is unique to the QQPhoto workload. The prefetcher proactively prefetches selected photos into cache before are requested for the first time to eliminate compulsory misses and promote hit ratios. Our extensive evaluation results show that with appropriate prefetching we improve the cache hit ratio by up to 7.4%, while reducing the average access latency by 6.9% at a marginal cost of 4.14% backend network traffic compared to the original system that performs no prefetching. 
    more » « less