Title: Pegasus: Tolerating Skewed Workloads in Distributed Storage with In-Network Coherence Directories
High performance distributed storage systems face the challenge of load imbalance caused by skewed and dynamic workloads. This paper introduces Pegasus, a new storage system that leverages new-generation programmable switch ASICs to balance load across storage servers. Pegasus uses selective replication of the most popular objects in the data store to distribute load. Using a novel in-network coherence directory, the Pegasus switch tracks and manages the location of replicated objects. This allows it to achieve load-aware forwarding and dynamic rebalancing for replicated keys, while still guaranteeing data coherence and consistency. The Pegasus design is practical to implement as it stores only forwarding metadata in the switch data plane. The resulting system improves the throughput of a distributed in-memory key-value store by more than 10x under a latency SLO -- results which hold across a large set of workloads with varying degrees of skew, read/write ratio, object sizes, and dynamism.
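To make the abstract's mechanism concrete, here is a minimal Python sketch of an in-network coherence directory: a small map from hot keys to replica sets, consulted on every request, with reads steered to the least-loaded replica and writes shrinking the replica set so only up-to-date copies remain visible. The class name, the least-loaded policy, and the invalidate-on-write rule are illustrative assumptions, not the paper's switch-ASIC implementation.

```python
# Hypothetical sketch of Pegasus-style load-aware forwarding for
# replicated hot keys; the real system implements this logic in a
# programmable switch's data plane, not in Python.

class CoherenceDirectory:
    def __init__(self, servers):
        self.servers = list(servers)               # all storage servers
        self.replicas = {}                         # hot key -> set of replica servers
        self.load = {s: 0 for s in self.servers}   # per-server request counters

    def home(self, key):
        # Unreplicated keys live at a single consistent "home" node.
        return self.servers[hash(key) % len(self.servers)]

    def route_read(self, key):
        # Reads on a replicated key go to the least-loaded replica.
        targets = self.replicas.get(key)
        target = min(targets, key=lambda s: self.load[s]) if targets else self.home(key)
        self.load[target] += 1
        return target

    def route_write(self, key):
        # Writes pick one replica and drop the rest from the directory,
        # so readers only ever see servers holding the latest version
        # (coherence); rebalancing can re-replicate the key later.
        targets = self.replicas.get(key)
        if not targets:
            target = self.home(key)
        else:
            target = min(targets, key=lambda s: self.load[s])
            self.replicas[key] = {target}
        self.load[target] += 1
        return target
```

Because only this forwarding metadata (replica sets and counters) lives in the directory, the sketch mirrors why the design fits in limited switch memory.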
Award ID(s):
1918757
PAR ID:
10283420
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
14th USENIX Symposium on Operating Systems Design and Implementation
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Distributed key-value stores today require frequent key-value shard migration between nodes to react to dynamic workload changes for load balancing, data locality, and service elasticity. In this paper, we propose NetMigrate, a live migration approach for in-memory key-value stores based on programmable network data planes. NetMigrate migrates shards between nodes with zero service interruption and minimal performance impact. During migration, the switch data plane monitors the migration process in a fine-grained manner and directs client queries to the right server in real time, eliminating the overhead of pulling data between nodes. We implement a NetMigrate prototype on a testbed consisting of a programmable switch and several commodity servers running Redis and evaluate it under YCSB workloads. Our experiments demonstrate that NetMigrate improves query throughput by 6.5% to 416% and maintains low access latency during migration, compared to state-of-the-art migration approaches.
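A rough sketch of the query-steering idea in this abstract, assuming a three-state per-key view of the migration; the state names, the write-to-destination rule, and the read-to-both fallback are hypothetical, not NetMigrate's exact data-plane protocol.

```python
# Hypothetical sketch of switch-side query steering during a live
# shard migration between a source and a destination server.

SOURCE, MIGRATING, MIGRATED = range(3)

class MigrationRouter:
    def __init__(self, src, dst):
        self.src, self.dst = src, dst
        self.state = {}              # key -> SOURCE | MIGRATING | MIGRATED

    def route(self, op, key):
        s = self.state.get(key, SOURCE)
        if s == MIGRATED:
            return [self.dst]        # data already lives at the destination
        if s == SOURCE:
            return [self.src]        # untouched keys stay at the source
        # In-flight keys: writes go to the destination (the new owner);
        # reads are sent to both so whichever holds the value answers,
        # avoiding a blocking pull of the data between the two nodes.
        return [self.dst] if op == "write" else [self.src, self.dst]

    def mark_migrating(self, key):
        self.state[key] = MIGRATING  # copy of this key is in flight

    def mark_migrated(self, key):
        self.state[key] = MIGRATED   # updated as the copy stream progresses
```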
  2. Persistent key-value stores are widely used as building blocks in today’s IT infrastructure for managing and storing large amounts of data. However, studies characterizing real-world workloads for key-value stores are limited due to the lack of tracing and analysis tools and the difficulty of collecting traces in operational environments. In this paper, we first present a detailed characterization of workloads from three typical RocksDB production use cases at Facebook: UDB (a MySQL storage layer for social graph data), ZippyDB (a distributed key-value store), and UP2X (a distributed key-value store for AI/ML services). These characterizations reveal several interesting findings: first, the distributions of key and value sizes are highly related to the use cases/applications; second, accesses to key-value pairs exhibit good locality and follow certain special patterns; and third, the collected performance metrics show a strong diurnal pattern in UDB, but not in the other two. We further discover that although the widely used key-value benchmark YCSB provides various workload configurations and key-value pair access distribution models, the workloads YCSB generates for the underlying storage systems are still not close enough to the workloads we collected, because YCSB ignores key-space locality. To address this issue, we propose key-range-based modeling and develop a benchmark that better emulates the workloads of real-world key-value stores. This benchmark synthetically generates more precise key-value queries that represent the reads and writes key-value stores issue to the underlying storage system.
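The key-range-based modeling idea can be illustrated with a small generator: split the key space into ranges, give each range its own fitted access probability, and then draw a key within the chosen range, in contrast to YCSB's single distribution over the whole key space. A minimal Python sketch, where the Zipf-like range weighting and all parameters are illustrative assumptions rather than the paper's fitted model:

```python
import random

# Hypothetical key-range-based workload generator: hot/cold ranges
# capture key-space locality that a whole-keyspace distribution misses.

def make_generator(num_ranges=100, keys_per_range=1000, zipf_s=1.2):
    # Per-range popularity: Zipf-like weights (illustrative parameters;
    # a real model would fit these to collected production traces).
    weights = [1.0 / (r + 1) ** zipf_s for r in range(num_ranges)]
    ranges = list(range(num_ranges))

    def next_key():
        r = random.choices(ranges, weights=weights)[0]  # pick a range
        offset = random.randrange(keys_per_range)       # locality within it
        return r * keys_per_range + offset

    return next_key

gen = make_generator()
sample = [gen() for _ in range(5)]  # e.g. five synthetic query keys
```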
  3. While permissioned blockchains enable a family of data center applications, existing systems suffer from imbalanced loads across compute and memory, exacerbating the underutilization of cloud resources. This paper presents FlexChain, a novel permissioned blockchain system that addresses this challenge by physically disaggregating CPUs, DRAM, and storage devices to process different blockchain workloads efficiently. Disaggregation allows blockchain service providers to upgrade and expand hardware resources independently to support a wide range of smart contracts with diverse CPU and memory demands. Moreover, it ensures efficient resource utilization and hence prevents resource fragmentation in a data center. We have explored the design of XOV blockchain systems in a disaggregated fashion and developed a tiered key-value store that can elastically scale its memory and storage. Our design significantly speeds up the execution stage. We have also leveraged several techniques to parallelize the validation stage in FlexChain to further improve overall blockchain performance. Our evaluation results show that FlexChain can provide independent compute and memory scalability, while incurring at most 12.8% disaggregation overhead. FlexChain achieves almost identical throughput to state-of-the-art distributed approaches, with significantly lower memory and CPU consumption for compute-intensive and memory-intensive workloads, respectively.
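As a loose illustration of a tiered key-value store whose memory tier can be resized independently of its storage tier, here is a minimal Python sketch; the LRU spill policy, the capacities, and the method names are assumptions for illustration, not FlexChain's actual design.

```python
from collections import OrderedDict

# Hypothetical tiered key-value store: a bounded DRAM tier backed by a
# larger storage tier, each independently scalable on disaggregated
# hardware.  Eviction policy (LRU) is an illustrative choice.

class TieredKVStore:
    def __init__(self, mem_capacity=1024):
        self.mem_capacity = mem_capacity
        self.mem = OrderedDict()   # hot tier: disaggregated DRAM
        self.storage = {}          # cold tier: disaggregated storage

    def put(self, key, value):
        self.mem[key] = value
        self.mem.move_to_end(key)                # mark most recently used
        self._spill()

    def get(self, key):
        if key in self.mem:
            self.mem.move_to_end(key)            # refresh recency
            return self.mem[key]
        if key in self.storage:
            value = self.storage.pop(key)        # promote on access
            self.put(key, value)
            return value
        return None

    def resize_memory(self, new_capacity):
        # Elastic scaling: grow or shrink the DRAM tier without touching
        # the storage tier, as disaggregation permits.
        self.mem_capacity = new_capacity
        self._spill()

    def _spill(self):
        while len(self.mem) > self.mem_capacity:
            cold_key, cold_value = self.mem.popitem(last=False)  # evict LRU
            self.storage[cold_key] = cold_value
```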
  4. The widespread adoption of GPS-enabled devices and the Internet of Things (IoT) has increased the amount of spatial data being generated every second. The current scale of spatial data cannot be handled using centralized systems. This has led to the development of distributed spatial data streaming systems that scale to process, in real time, large amounts of streamed spatial data. The performance of distributed streaming systems relies on how evenly the workload is distributed among their machines. However, it is challenging to estimate the workload of each machine because spatial data and query streams are skewed and change rapidly with time and users' interests. Moreover, a distributed spatial streaming system often does not maintain a global system workload state because collecting it from the machines in the system requires high network and processing overheads. This paper introduces TrioStat, an online workload estimation technique that relies on a probabilistic model for estimating the workload of partitions and machines in a distributed spatial data streaming system. Because collecting and exchanging statistics with a centralized unit would incur high network overhead, TrioStat instead uses a decentralized technique to collect and maintain the required statistics in real time, locally on each machine. TrioStat enables distributed spatial data streaming systems to compare the workloads of machines as well as the workloads of data partitions, with minimal network and storage overhead. Moreover, the required storage is distributed across the system's machines.
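A minimal sketch of decentralized, per-machine workload estimation in the spirit described above: each machine keeps exponentially decayed query counts per local partition, so machine and partition loads can be compared without shipping raw statistics to a central unit. The decay model and all names are illustrative assumptions, not TrioStat's actual probabilistic model.

```python
# Hypothetical per-machine workload estimator: purely local updates,
# so no statistics need to flow to a centralized collector.

class LocalWorkloadEstimator:
    def __init__(self, decay=0.9):
        self.decay = decay
        self.partition_load = {}        # partition id -> decayed query rate

    def record_query(self, partition):
        # Lightweight local update on every query; no network traffic.
        self.partition_load[partition] = \
            self.partition_load.get(partition, 0.0) + 1.0

    def end_of_epoch(self):
        # Periodic decay keeps estimates tracking a dynamic, skewed stream.
        for p in self.partition_load:
            self.partition_load[p] *= self.decay

    def machine_load(self):
        # A machine's load estimate is the sum over its partitions; peers
        # can exchange these compact summaries to find imbalances.
        return sum(self.partition_load.values())
```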
  5. Fast networks and the desire for high resource utilization in data centers and the cloud have driven disaggregation. Application compute is separated from storage, but this leads to high overheads when data must move over the network even for simple operations. Alternatively, systems could allow applications to run application logic within storage via user-defined functions. Unfortunately, this ties provisioning and utilization of storage and compute resources together again. We present a new approach to executing storage-level functions in an in-memory key-value store that avoids this problem by dynamically deciding where to execute functions over data. Users write storage functions that are logically decoupled from storage, but storage servers choose where to run invocations of these functions physically. By using a server-internal cost model and observing function execution, servers directly run inexpensive functions while preferring to execute functions with high CPU cost at client machines. We show that with this approach storage servers can reduce network request processing costs, avoid server compute bottlenecks, and improve aggregate storage system throughput. We realize our approach on an in-memory key-value store that executes 3.2 million strictly serializable user-defined storage functions per second with 100 µs response times. When running a mix of logic from different applications, it provides better throughput than running that logic purely at storage servers (85% more) or purely at clients (10% more). For our workloads, it also reduces latency (up to 2x) and transactional aborts (up to 33%) compared to pure client-side execution.
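The placement decision this abstract describes can be sketched as a simple cost comparison: run a function at the storage server unless its observed CPU cost exceeds the network cost of shipping its input data to the client. The Python below is a hedged reading of that idea; the smoothing rule, the per-byte network cost, and all names are hypothetical, not the paper's cost model.

```python
# Hypothetical server-internal placement policy: cheap functions run at
# the storage server; CPU-heavy functions are pushed to the client.

class FunctionPlacer:
    def __init__(self, alpha=0.2):
        self.alpha = alpha            # smoothing factor for observed cost
        self.cpu_cost = {}            # function name -> mean CPU microseconds

    def observe(self, fn, measured_us):
        # Update the running estimate from each observed execution.
        old = self.cpu_cost.get(fn, measured_us)
        self.cpu_cost[fn] = (1 - self.alpha) * old + self.alpha * measured_us

    def place(self, fn, data_bytes, net_us_per_byte=0.01):
        # Compare estimated server CPU time against the network cost of
        # moving the function's input data to the client instead.
        ship_cost = data_bytes * net_us_per_byte
        run_cost = self.cpu_cost.get(fn, 0.0)
        return "client" if run_cost > ship_cost else "server"
```

In this reading, the server keeps compute headroom for cheap, data-reducing functions while offloading expensive logic, which is one way to get the best of both server-side and client-side execution the abstract compares against.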