Title: Identifying Influential Factors of CDN Performance with Large-scale Data Analysis
Content Distribution Networks (CDNs) manage their own caching or routing overlay networks to provide reliable and efficient content delivery services. CDNs have become one of the most important tools on the Internet and are responsible for the majority of today's Internet traffic. The performance of a CDN directly influences the experience of end users. In this paper, we develop several analyses to identify the key factors influencing the overall performance of a CDN. The primary results demonstrate that caching overlays and routing overlays each have their own influential factors affecting CDN performance. Our results also show that the transmission latency between a surrogate and a content owner is a critical factor determining the overall performance of routing overlays. Furthermore, we argue that the surrogate assignment policy of a routing overlay needs to take this latency seriously into account. Our analysis results provide context for the CDN community on preferable surrogate assignment solutions.
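The abstract's argument about surrogate assignment can be illustrated with a minimal sketch. It is not the paper's method: the `Surrogate` type, the latency fields, the example values, and the additive cost are all assumptions chosen for illustration. It contrasts a policy that looks only at client-to-surrogate latency with one that also counts the surrogate-to-content-owner latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Surrogate:
    """Hypothetical candidate surrogate; all fields are illustrative."""
    name: str
    client_latency_ms: float  # latency between the end user and the surrogate
    origin_latency_ms: float  # latency between the surrogate and the content owner


def pick_nearest(surrogates):
    """Baseline: choose the surrogate closest to the client only."""
    return min(surrogates, key=lambda s: s.client_latency_ms)


def pick_origin_aware(surrogates):
    """Assumed policy: minimize the sum of both latency legs."""
    return min(surrogates, key=lambda s: s.client_latency_ms + s.origin_latency_ms)


# Made-up example values, not measurements from the paper.
candidates = [
    Surrogate("edge-a", client_latency_ms=10.0, origin_latency_ms=120.0),
    Surrogate("edge-b", client_latency_ms=25.0, origin_latency_ms=40.0),
]

nearest = pick_nearest(candidates)
origin_aware = pick_origin_aware(candidates)
```

With these made-up numbers the two policies disagree: the nearest surrogate has the longer path back to the content owner, so the origin-aware policy prefers the other one. A real policy would likely weight the two legs differently depending on how often a request must travel to the content owner.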
Award ID(s):
1662487
PAR ID:
10077215
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
2018 International Conference on Computing, Networking and Communications (ICNC)
Page Range / eLocation ID:
873 to 877
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Content delivery networks (CDNs) distribute much of today's Internet traffic by caching and serving user-requested content. A major goal of a CDN is to improve the hit probabilities of its caches, thereby reducing WAN traffic and user-perceived latency. In this paper, we develop a new approach to caching in CDNs that learns from optimal caching to make decisions. To attain this goal, we first propose HRO to compute the upper bound on optimal caching in an online manner, and then leverage HRO to inform future content admission and eviction. We call this new cache design LHR. We show that LHR is efficient, since it includes a detection mechanism for model updates and an auto-tuned threshold-based model for content admission with a simple eviction rule. We have implemented an LHR simulator as well as a prototype within an Apache Traffic Server and the Caffeine, respectively. Our experimental results using four production CDN traces show that LHR consistently outperforms state-of-the-art methods, with an increase in hit probability of up to 9% and a reduction in WAN traffic of up to 15% compared to a typical production CDN cache. Our evaluation of the LHR prototype shows that it imposes only a moderate overhead and can be deployed on today's CDN servers.
  2. 2022 USENIX Annual Technical Conference (Ed.)
    Caches are pervasively used in content delivery networks (CDNs) to serve requests close to users and thus reduce content access latency. However, designing latency-optimal caches is challenging in the presence of delayed hits, which occur in high-throughput systems when multiple requests for the same content arrive before the content is fetched from the remote server. In this paper, we propose a novel timer-based mechanism that provably optimizes the mean caching latency, providing a theoretical basis for the understanding and design of latency-aware (LA) caching, which is fundamental to content delivery in latency-sensitive systems. Our timer-based model yields a simple ranking function that quickly gives the priority of a content item with respect to our goal of minimizing latency. Based on this, we propose a lightweight latency-aware caching algorithm named LA-Cache. We have implemented a prototype within Apache Traffic Server, a popular CDN server. The latency achieved by our implementation agrees closely with the theoretical predictions of our model. Our experimental results using production traces show that LA-Cache consistently reduces latencies by 5%-15% compared to state-of-the-art methods, depending on the backend RTTs.
  3. Content delivery networks (CDNs) cache and deliver hundreds of trillions of user requests each day from hundreds of thousands of servers around the world. The traffic served by CDNs can be partitioned into hundreds of traffic classes, each with different user access patterns, popularity distributions, object sizes, and performance requirements. Midgress is the cache miss traffic between the CDN's servers and the content provider origins. A major goal of a CDN is to minimize its midgress, since higher midgress translates to higher bandwidth costs and increased user-perceived latency. We propose algorithms that provision traffic classes to servers such that midgress is minimized. Using extensive traces from Akamai's CDN, we show that our midgress-aware traffic provisioning schemes can reduce midgress by nearly 20% in comparison with the midgress-unaware schemes currently in use. We also propose an efficient heuristic for traffic provisioning that achieves near-optimal midgress and is suitable for use in production settings. Further, we show how our algorithms can be extended to other settings that require minimum caching performance per traffic class and minimum content duplication for fault tolerance. Finally, our paper provides a strong case for implementing midgress-aware traffic provisioning in production CDNs. 
  4. Content delivery networks (CDNs) cache and serve a majority of the user-requested content on the Internet. Designing caching algorithms that automatically adapt to the heterogeneity, burstiness, and non-stationary nature of real-world content requests is a major challenge and is the focus of our work. While there is much work on caching algorithms for stationary request traffic, the work on non-stationary request traffic is very limited. Consequently, most prior models are inaccurate for non-stationary production CDN traffic. We propose two TTL-based caching algorithms that provide provable performance guarantees for request traffic that is bursty and non-stationary. The first algorithm, called d-TTL, dynamically adapts a TTL parameter using stochastic approximation. Given a feasible target hit rate, we show that d-TTL converges to its target value for a general class of bursty traffic that allows Markov dependence over time and non-stationary arrivals. The second algorithm, called f-TTL, uses two caches, each with its own TTL. The first-level cache adaptively filters out non-stationary traffic, while the second-level cache stores frequently accessed stationary traffic. Given feasible targets for both the hit rate and the expected cache size, f-TTL asymptotically achieves both targets. We evaluate both d-TTL and f-TTL using an extensive trace containing more than 500 million requests from a production CDN server. We show that both d-TTL and f-TTL converge to their hit rate targets with an error of about 1.3%. However, f-TTL requires a significantly smaller cache size than d-TTL to achieve the same hit rate, since it effectively filters out non-stationary content.
  5. Content Delivery Networks (CDNs) are Internet-scale systems that deliver streaming and web content to users from many geographically distributed edge data centers. Since large CDNs can comprise hundreds of thousands of servers deployed in thousands of global data centers, they can consume a large amount of energy for their operations and thus are responsible for large amounts of greenhouse gas (GHG) emissions. As these networks scale to cope with increased demand for bandwidth-intensive content, their emissions are expected to rise further, making sustainable design and operation an important goal for the future. Since different geographic regions vary in the carbon intensity and cost of their electricity supply, in this paper we consider spatial shifting as a key technique to jointly optimize the carbon emissions and energy costs of a CDN. We present two forms of shifting: spatial load shifting, which operates on a time scale of minutes, and VM capacity shifting, which operates on a coarse time scale of days or weeks. The proposed techniques jointly reduce carbon and electricity costs while considering the performance impact of increased request latency from such optimizations. Using real-world traces from a large CDN and carbon-intensity and energy-price data from electric grids in different regions, we show that increasing the latency by 60 ms can reduce carbon emissions by up to 35.5%, 78.6%, and 61.7% across the US, Europe, and worldwide, respectively. In addition, we show that capacity shifting can increase carbon savings by up to 61.2%. Finally, we analyze the benefits of spatial shifting and show that it increases carbon savings from added solar energy by 68% and 130% in the US and Europe, respectively.
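The idea behind d-TTL in item 4, adapting a TTL by stochastic approximation until the hit rate reaches a target, can be sketched as follows. This is a simplified illustration rather than the published algorithm: the function name `simulate_d_ttl`, the constant step size, the single TTL shared by all objects, and the reset-on-every-request rule are assumptions made to keep the example short.

```python
def simulate_d_ttl(requests, target_hit_rate, step=0.05, initial_ttl=1.0, max_ttl=1e6):
    """Replay (time, key) requests through a TTL cache whose TTL is adapted.

    Assumed update rule: after each request, move the TTL by
    step * (target_hit_rate - hit), where hit is 1 on a cache hit and 0 on a
    miss. Misses push the TTL up and hits pull it down, so the TTL settles
    where the observed hit rate matches the target (if that is feasible).
    """
    ttl = initial_ttl
    expiry = {}  # key -> time at which the cached copy expires
    hits = 0
    total = 0
    for now, key in requests:
        hit = expiry.get(key, float("-inf")) > now
        hits += hit
        total += 1
        # Stochastic-approximation step, clipped to [0, max_ttl].
        ttl = min(max(ttl + step * (target_hit_rate - hit), 0.0), max_ttl)
        # Assumption: the timer is reset on every request, hit or miss.
        expiry[key] = now + ttl
    hit_rate = hits / total if total else 0.0
    return ttl, hit_rate


# Hypothetical workload: one object requested once per time unit.
trace = [(float(t), "object-a") for t in range(1000)]
final_ttl, observed_hit_rate = simulate_d_ttl(trace, target_hit_rate=0.5)
```

On this toy trace the TTL hovers around the inter-request gap and the observed hit rate stays close to the target. The published algorithm comes with convergence guarantees under much more general traffic, which this sketch does not attempt to reproduce.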