Title: Minimizing content staleness in dynamo-style replicated storage systems
Consistency in data storage systems requires every read operation to return the most recently written version of the content. In replicated storage systems, consistency comes at the price of delay due to large-scale write and read operations. Many applications with low latency requirements tolerate data staleness in order to provide high availability and low operation latency. Using age of information as the staleness metric, we examine a data updating system in which real-time content updates are replicated and stored in a Dynamo-style quorum-based distributed system. A source sends updates to all the nodes in the system and waits for acknowledgements from the earliest subset of nodes, known as a write quorum. An interested client fetches the update from another set of nodes, defined as a read quorum. We analyze the staleness-delay tradeoff in replicated storage by varying the write quorum size. With a larger write quorum, an instantaneous read is more likely to get the latest update written by the source. However, the content written to the system is more likely to become stale as the write quorum size increases. For shifted-exponentially distributed write delay, we derive the age-optimized write quorum size that balances the likelihood of reading the latest update and the freshness of the latest update written by the source.
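The two sides of the tradeoff described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's derivation. It assumes n nodes with i.i.d. shifted-exponential write delays, a read quorum of r nodes chosen uniformly at random, and that only the w acknowledged nodes hold the latest update at read time (a conservative simplification). The function names and the parameters `n`, `w`, `r`, `shift`, and `rate` are hypothetical, introduced only for illustration.

```python
from math import comb


def read_hits_write_quorum(n: int, w: int, r: int) -> float:
    """Probability that a uniformly random read quorum of r nodes
    overlaps a write quorum of w nodes out of n (assumed model)."""
    # comb(n - w, r) counts the read quorums that miss every written node;
    # it is 0 whenever r > n - w, i.e. when the quorums must intersect.
    return 1.0 - comb(n - w, r) / comb(n, r)


def expected_write_time(n: int, w: int, shift: float, rate: float) -> float:
    """Expected time until the w-th fastest of n i.i.d. shifted-exponential
    write delays completes: shift + (H_n - H_{n-w}) / rate."""
    # Sum of 1/n + 1/(n-1) + ... + 1/(n-w+1), the standard mean of the
    # w-th order statistic of n unit-rate exponentials.
    return shift + sum(1.0 / (n - i) for i in range(w)) / rate


if __name__ == "__main__":
    n, r = 5, 2
    for w in range(1, n + 1):
        p = read_hits_write_quorum(n, w, r)
        t = expected_write_time(n, w, shift=1.0, rate=1.0)
        print(f"w={w}: P(read sees latest) >= {p:.3f}, E[write time] = {t:.3f}")
```

Both quantities grow with w: a larger write quorum makes a read more likely to find the latest update, but the source waits longer for acknowledgements, so the update it has just written is older. The age-optimal write quorum size derived in the paper balances these two effects.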
Award ID(s):
1717041
NSF-PAR ID:
10109204
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2018 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS): AoI Workshop 2018
Page Range / eLocation ID:
361 to 366
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The adoption of low latency persistent memory modules (PMMs) upends the long-established model of remote storage for distributed file systems. Instead, by colocating computation with PMM storage, we can provide applications with much higher IO performance, sub-second application failover, and strong consistency. To demonstrate this, we built the Assise distributed file system, based on a persistent, replicated coherence protocol that manages client-local PMM as a linearizable and crash-recoverable cache between applications and slower (and possibly remote) storage. Assise maximizes locality for all file IO by carrying out IO on process-local, socket-local, and client-local PMM whenever possible. Assise minimizes coherence overhead by maintaining consistency at IO operation granularity, rather than at fixed block sizes. We compare Assise to Ceph/BlueStore, NFS, and Octopus on a cluster with Intel Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as LevelDB, Postfix, and FileBench. We find that Assise improves write latency up to 22x, throughput up to 56x, fail-over time up to 103x, and scales up to 6x better than its counterparts, while providing stronger consistency semantics.
  2. High performance distributed storage systems face the challenge of load imbalance caused by skewed and dynamic workloads. This paper introduces Pegasus, a new storage system that leverages new-generation programmable switch ASICs to balance load across storage servers. Pegasus uses selective replication of the most popular objects in the data store to distribute load. Using a novel in-network coherence directory, the Pegasus switch tracks and manages the location of replicated objects. This allows it to achieve load-aware forwarding and dynamic rebalancing for replicated keys, while still guaranteeing data coherence and consistency. The Pegasus design is practical to implement as it stores only forwarding metadata in the switch data plane. The resulting system improves the throughput of a distributed in-memory key-value store by more than 10x under a latency SLO, a result that holds across a large set of workloads with varying degrees of skew, read/write ratio, object sizes, and dynamism.
  3. The infrastructure available to large-scale and medium-scale web services now spans dozens of geographically dispersed datacenters. Deploying across many datacenters has the potential to significantly reduce end-user latency by serving users nearer their location. However, deploying across many datacenters requires that the backend storage system be partially replicated. In turn, this can sacrifice the low latency benefits of many datacenters, especially when a storage system provides guarantees on what operations will observe. We present the K2 storage system, which provides lower latency for large-scale and medium-scale web services using partial replication of data over many datacenters with strong guarantees: causal consistency, read-only transactions, and write-only transactions. K2 provides the best possible worst-case latency for partial replication, a single round trip to remote datacenters, and often avoids sending any requests to far away datacenters using a novel replication approach, write-only transaction algorithm, and read-only transaction algorithm.
  4. We consider a multicast network in which real-time status updates generated by a source are replicated and sent to multiple interested receiving nodes through independent links. The receiving nodes are divided into two groups: a priority group of k nodes that require the reception of every update packet, and a non-priority group of all other nodes, which have no delivery requirement. Using age of information as a freshness metric, we analyze the time-averaged age at both priority and non-priority nodes. For shifted-exponential link delay distributions, the average age at a priority node is lower than that at a non-priority node due to the delivery guarantee. However, this advantage for priority nodes disappears if the link delay is exponentially distributed. In that case, both groups of nodes have the same time-averaged age, which implies that the guaranteed delivery of updates has no effect on the time-averaged freshness.
  5. One main objective of ultra-low-latency communications is to minimize the data staleness at the receivers, recently characterized by a metric called Age-of-Information (AoI). While the question of when to send the next update packet has been the central subject of AoI minimization, each update packet also incurs the cost of transmission that needs to be jointly considered in a practical design. With the exponential growth of interconnected devices and the increasing risk of excessive resource consumption in mind, this work derives an optimal joint cost-and-AoI minimization solution for multiple coexisting source-destination (S-D) pairs. The results admit a new AoI-market-price-based interpretation and are applicable to the setting of (a) general heterogeneous AoI penalty functions and Markov delay distributions for each S-D pair, and (b) a general network cost function of aggregate throughput of all S-D pairs. Extensive simulation is used to demonstrate the superior performance of the proposed scheme.