Reasoning about modern datacenter infrastructures using partial histories

Sun, Xudong; Suresh, Lalith; Ganesan, Aishwarya; Alagappan, Ramnatthan; Gasch, Michael; Tang, Lilia; Xu, Tianyin

doi:10.1145/3458336.3465276

Modern datacenter infrastructures are increasingly architected as a cluster of loosely coupled services. The cluster states are typically maintained in a logically centralized, strongly consistent data store (e.g., ZooKeeper, Chubby and etcd), while the services learn about the evolving state by reading from the data store, or via a stream of notifications. However, it is challenging to ensure services are correct, even in the presence of failures, networking issues, and the inherent asynchrony of the distributed system. In this paper, we identify that partial histories can be used to effectively reason about correctness for individual services in such distributed infrastructure systems. That is, individual services make decisions based on observing only a subset of changes to the world around them. We show that partial histories, when applied to distributed infrastructures, have immense explanatory power and utility over the state of the art. We discuss the implications of partial histories and sketch tooling for reasoning about distributed infrastructure systems.

Award ID(s): 2029049; 1816615

PAR ID: 10293053

Author(s) / Creator(s): Sun, Xudong; Suresh, Lalith; Ganesan, Aishwarya; Alagappan, Ramnatthan; Gasch, Michael; Tang, Lilia; Xu, Tianyin

Date Published: 2021-06-01

Journal Name: In Proceedings of the 18th Workshop on Hot Topics in Operating Systems (HotOS-XVIII)

Page Range / eLocation ID: 213 to 220

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3458336.3465276
