Title: Seer: Enabling Future-Aware Online Caching in Networked Systems
State-intensive network and distributed applications rely heavily on online caching heuristics for high performance. However, a fundamental performance gap remains between online caching heuristics and the optimal offline caching algorithm, owing to the lack of visibility into future state access requests in an online setting. Driven by the observation that state access requests in network and distributed applications are often carried in incoming network packets, we present Seer, an online caching solution for networked systems that exploits the delays experienced by a packet inside a network (most prominently, transmission and queuing delays) to notify the target network nodes (switches/routers/middleboxes/end-hosts) implementing caching of future packet arrivals in advance. Using this as a building block, Seer presents the design of an online cache manager that leverages visibility into a (partial) set of future state access requests to make smarter prefetching and cache eviction decisions. Our evaluations show that Seer achieves up to 65% lower cache miss ratio and up to 78% lower flow completion time compared to LRU for key network applications over realistic workloads.
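To make the eviction side of this concrete, here is a minimal sketch assuming early notifications of some future key accesses are available; the names (FutureAwareCache, notify_future) are illustrative, not from the paper. Keys with no known future request are evicted first in LRU order; otherwise the key with the farthest notified next use is evicted, Belady-style.

```python
from collections import OrderedDict

class FutureAwareCache:
    """Toy eviction policy with partial future knowledge: keys without a
    known future request are evicted first (in LRU order); otherwise the
    key whose notified next use is farthest away is evicted."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()   # key -> value, least-recently-used first
        self.next_use = {}           # key -> earliest notified future access time

    def notify_future(self, key, arrival_time):
        # Advance notice derived from in-network delays: a packet that
        # will access `key` is expected to arrive at `arrival_time`.
        if arrival_time < self.next_use.get(key, float("inf")):
            self.next_use[key] = arrival_time

    def _evict(self, now):
        for key in self.store:                     # LRU-first scan
            if self.next_use.get(key, -1) < now:   # no valid future notice
                del self.store[key]
                return
        victim = max(self.store, key=lambda k: self.next_use[k])
        del self.store[victim]

    def access(self, key, value, now):
        if key in self.store:
            self.store.move_to_end(key)            # refresh recency
        else:
            if len(self.store) >= self.capacity:
                self._evict(now)
            self.store[key] = value
        if self.next_use.get(key, -1) <= now:      # notice consumed
            self.next_use.pop(key, None)
```

Prefetching can reuse the same signal: on a notification for a key that is not cached, the manager can fetch it before the notified arrival time.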
Award ID(s):
2239829
PAR ID:
10570725
Author(s) / Creator(s):
Publisher / Repository:
USENIX Symposium on Networked Systems Design and Implementation (NSDI)
Date Published:
ISBN:
978-1-939133-39-7
Page Range / eLocation ID:
635 - 649
Format(s):
Medium: X
Location:
Santa Clara, CA
Sponsoring Org:
National Science Foundation
More Like this
1. Caches are at the heart of latency-sensitive systems. In this paper, we identify a growing challenge for the design of latency-minimizing caches called delayed hits. Delayed hits occur at high throughput, when multiple requests to the same object queue up before an outstanding cache miss is resolved. This effect increases latencies beyond the predictions of traditional caching models and simulations; in fact, caching algorithms are designed as if delayed hits simply didn't exist. We show that traditional caching strategies, even so-called 'optimal' algorithms, can fail to minimize latency in the presence of delayed hits. We design a new, latency-optimal offline caching algorithm called Belatedly, which reduces average latencies by up to 45% compared to the traditional, hit-rate-optimal Belady's algorithm. Using Belatedly as our guide, we show that incorporating an object's 'aggregate delay' into online caching heuristics can improve latencies for practical caching systems by up to 40%. We implement a prototype, Minimum-AggregateDelay (MAD), within a CDN caching node. Using a CDN production trace and backends deployed in different geographic locations, we show that MAD can reduce latencies by 12-18% depending on the backend RTTs.
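As a hedged illustration of the delayed-hit effect, this toy timeline (assumed fetch latency, not from the paper) computes the latency seen by each request to a single object when requests arrive while the initial miss is still outstanding:

```python
# Toy single-object timeline for delayed hits: requests that arrive
# while a miss for the same object is in flight are neither classic
# hits nor misses; they queue and absorb part of the fetch latency.

FETCH_LATENCY = 100  # time units to resolve a miss (assumed)

def request_latencies(arrival_times):
    """Latency of each request to one object, starting from a cold cache."""
    fetch_done = arrival_times[0] + FETCH_LATENCY   # first request misses
    return [0 if t >= fetch_done else fetch_done - t for t in arrival_times]

print(request_latencies([0, 10, 50, 150]))   # -> [100, 90, 50, 0]
```

The sum of these latencies (240 time units here) is one plausible reading of the object's 'aggregate delay', the signal that MAD ranks evictions by instead of recency or frequency alone.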
2. This paper presents Packet Chasing, an attack on the network that does not require access to the network and works regardless of the privilege level of the process receiving the packets. A spy process can easily probe and discover the exact cache location of each buffer used by the network driver. Even more usefully, it can discover the exact sequence in which those buffers are used to receive packets. This enables packet frequency and packet sizes to be monitored through cache side channels, allowing both covert channels between a sender and a remote spy with no access to the network, and direct attacks that can identify, among other things, the web page access patterns of a victim on the network. In addition to identifying the potential attack, this work proposes a software-based short-term mitigation as well as a lightweight, adaptive cache-partitioning mitigation that blocks the interference of I/O and CPU requests in the last-level cache.
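The channel can be sketched as a toy discrete simulation (purely illustrative: a real attack needs native code and hardware timers, and every name here is invented). The spy primes the cache sets backing the driver's RX buffer ring, lets victim traffic arrive, then probes to see which of its lines were evicted:

```python
# Toy simulation of the buffer-probing idea: the driver fills RX ring
# buffers in order, each fill evicting the spy's line from the matching
# cache set; probing reveals how many buffers were touched. Repeated
# prime/probe rounds would additionally reveal the temporal order.

NUM_SETS = 8            # cache sets covering the RX buffer ring (assumed)
ring_position = 0

def driver_receive(cache_sets, n_packets):
    """Driver fills ring buffers in order, displacing the spy's lines."""
    global ring_position
    for _ in range(n_packets):
        cache_sets[ring_position % NUM_SETS] = "driver"
        ring_position += 1

def spy_prime(cache_sets):
    for s in range(NUM_SETS):
        cache_sets[s] = "spy"

def spy_probe(cache_sets):
    """Return the sets whose spy lines were evicted since the prime."""
    return [s for s in range(NUM_SETS) if cache_sets[s] != "spy"]

cache = ["spy"] * NUM_SETS
spy_prime(cache)
driver_receive(cache, n_packets=3)          # victim traffic arrives
touched = spy_probe(cache)
print(f"packets inferred: {len(touched)}, ring slots: {touched}")
```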
3. Crucial performance metrics of a caching algorithm include its ability to quickly and accurately learn the popularity distribution of requests. However, the majority of analytical performance work focuses on hit probability after an asymptotically large time has elapsed. We take an online learning viewpoint and characterize the "regret": the finite-time difference between the hits achieved by a candidate caching algorithm and those of a genie-aided scheme that places the most popular items in the cache. We first consider the Full Observation regime, wherein all requests are seen by the cache. We show that the Least Frequently Used (LFU) algorithm achieves order-optimal regret, which is matched by an efficient counting algorithm design that we call LFU-Lite. We then consider the Partial Observation regime, wherein only requests for items currently cached are seen by the cache, making it similar to an online learning problem related to the multi-armed bandit problem. We show how approaching this "caching bandit" with traditional approaches yields either high complexity or high regret, but a simple algorithm design that exploits the structure of the distribution can ensure order-optimal regret. We conclude by illustrating our insights using numerical simulations.
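A minimal sketch of this regret measure, under an assumed setup (Zipf-like popularity and the Full Observation regime; not the paper's exact model), compares the cumulative hits of an online LFU cache against the genie-aided static cache:

```python
import random
from collections import Counter

def regret(popularity, cache_size, horizon, rng):
    """Genie hits minus LFU hits over a finite horizon."""
    items = list(range(len(popularity)))
    genie = set(sorted(items, key=lambda i: -popularity[i])[:cache_size])
    counts = Counter()
    lfu_hits = genie_hits = 0
    for _ in range(horizon):
        x = rng.choices(items, weights=popularity)[0]   # next request
        lfu = {i for i, _ in counts.most_common(cache_size)}  # learned cache
        lfu_hits += x in lfu
        genie_hits += x in genie
        counts[x] += 1
    return genie_hits - lfu_hits

zipf = [1.0 / (i + 1) for i in range(20)]    # Zipf-like popularity (assumed)
print(regret(zipf, cache_size=5, horizon=10_000, rng=random.Random(0)))
```

LFU here recomputes its cached set from running counts; its finite-time gap to the genie is exactly the regret the abstract analyzes.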
4. Photo service providers face critical challenges in dealing with huge amounts of photo storage, typically on the order of billions of photos, while ensuring nation-wide or world-wide satisfactory user experiences. Distributed photo caching architectures are widely deployed to meet high performance expectations, in which efficient, yet still poorly understood, caching policies play an essential role. In this work, we present a comprehensive study of internet-scale photo caching algorithms in the case of QQPhoto from Tencent Inc., the largest social network service company in China. We unveil that even advanced cache algorithms perform only at a similar level to simple baseline algorithms, and that a large performance gap remains between these cache algorithms and the theoretically optimal algorithm, due to the complicated access behaviors in such a large multi-tenant environment. We then explain the reasons behind this phenomenon by extensively investigating the characteristics of QQPhoto workloads. Finally, in order to realistically improve QQPhoto cache efficiency further, we propose incorporating a prefetcher in the cache stack, based on the observed immediacy feature that is unique to the QQPhoto workload. The prefetcher proactively prefetches selected photos into the cache before they are requested for the first time, eliminating compulsory misses and improving hit ratios. Our extensive evaluation results show that with appropriate prefetching we improve the cache hit ratio by up to 7.4%, while reducing the average access latency by 6.9%, at a marginal cost of 4.14% more backend network traffic compared to the original system, which performs no prefetching.
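The prefetching idea can be sketched as follows (details assumed, not from the paper: here "related" objects are other resolutions of the same photo, and names like PrefetchingCache are invented). On a compulsory miss, related objects are pulled into the cache alongside the requested one:

```python
from collections import OrderedDict

class PrefetchingCache:
    """Toy LRU cache that, on a first-time (compulsory) miss, also
    prefetches related objects so their first requests become hits."""

    def __init__(self, capacity, related_fn):
        self.capacity = capacity
        self.related_fn = related_fn      # photo_id -> ids worth prefetching
        self.cache = OrderedDict()

    def _admit(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)   # evict LRU
        self.cache[key] = fetch_from_backend(key)

    def get(self, key):
        hit = key in self.cache
        self._admit(key)
        if not hit:                          # first access: prefetch relatives
            for rel in self.related_fn(key):
                self._admit(rel)
        return self.cache[key], hit

def fetch_from_backend(key):                 # placeholder backend fetch
    return f"bytes-of-{key}"

# e.g. other resolutions of the same photo share a common id prefix:
cache = PrefetchingCache(100, lambda k: [k.rsplit("@", 1)[0] + f"@{s}"
                                         for s in ("small", "large")])
print(cache.get("photo123@small"))   # miss; photo123@large prefetched too
print(cache.get("photo123@large"))   # hit thanks to prefetching
```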
5. Coded multicasting has been shown to be a promising approach to significantly improve the performance of content delivery networks with multiple caches downstream of a common multicast link. However, the schemes shown to achieve order-optimal performance require content items to be partitioned into a number of packets that grows exponentially with the number of caches, leading to codes of exponential complexity that jeopardize their promising performance benefits. In this paper, we address this crucial performance-complexity tradeoff in a heterogeneous caching network setting, where edge caches with possibly different storage capacities collect multiple content requests that may follow distinct demand distributions. We extend the asymptotic (in the number of packets per file) analysis of shared-link caching networks to heterogeneous network settings, and present novel coded multicast schemes, based on local graph coloring, that exhibit polynomial-time complexity in all the system parameters while preserving the asymptotically proven multiplicative caching gain even for finite file packetization. We further demonstrate that the packetization order (the number of packets each file is split into) can be traded off against the number of requests collected by each cache, while preserving the same multiplicative caching gain. Simulation results confirm the superiority of the proposed schemes and illustrate the interesting request-aggregation vs. packetization-order tradeoff within several practical settings. Our results provide a compelling step towards the practical achievability of the promising multiplicative caching gain in next-generation access networks.
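As a hedged illustration of coloring-based coded multicast (a textbook-style toy, not the paper's local-coloring scheme): build a conflict graph over (user, requested packet) pairs, where two requests are compatible, i.e., XOR-able into one transmission, iff each user already caches the other's packet; a greedy coloring then yields one coded send per color.

```python
# Toy coloring-based coded multicast: vertices are (user, packet) pairs;
# two vertices may share a color (one XORed transmission) iff they ask
# for the same packet, or each requester caches the other's packet.

def greedy_coded_multicast(demands, cache_contents):
    """demands: {user: wanted packet}; cache_contents: {user: set of packets}."""
    vertices = list(demands.items())             # (user, packet) pairs
    colors = {}                                  # vertex index -> color
    for i, (u, p) in enumerate(vertices):
        used = set()
        for j, (v, q) in enumerate(vertices[:i]):
            compatible = (p == q) or (
                q in cache_contents[u] and p in cache_contents[v])
            if not compatible:
                used.add(colors[j])
        colors[i] = min(c for c in range(len(vertices)) if c not in used)
    groups = {}
    for i, (u, p) in enumerate(vertices):
        groups.setdefault(colors[i], []).append((u, p))
    return groups                                # one XOR transmission per color

# Three users, each caching what the others want: one coded send suffices.
groups = greedy_coded_multicast(
    demands={"A": "p1", "B": "p2", "C": "p3"},
    cache_contents={"A": {"p2", "p3"}, "B": {"p1", "p3"}, "C": {"p1", "p2"}})
print(groups)   # {0: [('A', 'p1'), ('B', 'p2'), ('C', 'p3')]}
```

In the example, a single XOR of p1, p2, and p3 serves all three requests (each user cancels the two packets it caches), where an uncoded scheme would need three separate sends.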