To guard against machine failures, modern internet services store multiple replicas of the same application data within and across data centers, which introduces the problem of keeping geodistributed replicas consistent with one another in the face of network partitions and unpredictable message latency. To avoid costly and conservative synchronization protocols, many real-world systems provide only weak consistency guarantees (e.g., eventual, causal, or PRAM consistency), which permit certain kinds of disagreement among replicas. There has been much recent interest in language support for specifying and verifying such consistency properties. Although these properties are usually beyond the scope of what traditional type checkers or compiler analyses can guarantee, solver-aided languages are up to the task. Inspired by systems like Liquid Haskell [43] and Rosette [42], we believe that close integration between a language and a solver is the right path to consistent-by-construction distributed applications. Unfortunately, verifying distributed consistency properties requires reasoning about transitive relations (e.g., causality or happens-before), partial orders (e.g., the lattice of replica states under a convergent merge operation), and properties relevant to message processing or API invocation (e.g., commutativity and idempotence) that cannot be easily or efficiently carried out by general-purpose SMT solvers that lack native support for this kind of reasoning. We argue that domain-specific SMT-based tools that exploit the mathematical foundations of distributed consistency would enable both more efficient verification and improved ease of use for domain experts. The principle of exploiting domain knowledge for efficiency and expressivity that has borne fruit elsewhere — such as in the development of high-performance domain-specific languages that trade off generality to gain both performance and productivity — also applies here. Languages augmented with domain-specific, consistency-aware solvers would support the rapid implementation of formally verified programming abstractions that guarantee distributed consistency. In the long run, we aim to democratize the development of such domain-specific solvers by creating a framework for domain-specific solver development that brings new theory solver implementation within the reach of programmers who are not necessarily SMT solver internals experts.
more »
« less
Verified Causal Broadcast with Liquid Haskell
Protocols to ensure that messages are delivered in causal order are a ubiquitous building block of distributed systems. For instance, distributed data storage systems can use causally ordered message delivery to ensure causal consistency, and CRDTs can rely on the existence of an underlying causally-ordered messaging layer to simplify their implementation. A causal delivery protocol ensures that when a message is delivered to a process, any causally preceding messages sent to the same process have already been delivered to it. While causal delivery protocols are widely used, verification of their correctness is less common, much less machine-checked proofs about executable implementations. We implemented a standard causal broadcast protocol in Haskell and used the Liquid Haskell solver-aided verification system to express and mechanically prove that messages will never be delivered to a process in an order that violates causality. We express this property using refinement types and prove that it holds of our implementation, taking advantage of Liquid Haskell’s underlying SMT solver to automate parts of the proof and using its manual theorem-proving features for the rest. We then put our verified causal broadcast implementation to work as the foundation of a distributed key-value store.
more »
« less
- Award ID(s):
- 2145367
- PAR ID:
- 10425661
- Editor(s):
- Elsman, Martin
- Date Published:
- Journal Name:
- IFL '22: Proceedings of the 34th Symposium on Implementation and Application of Functional Languages
- Page Range / eLocation ID:
- 1 to 13
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Oshman, Rotem (Ed.)Broadcast protocols enable a set of n parties to agree on the input of a designated sender, even in the face of malicious parties who collude to attack the protocol. In the honest-majority setting, a fruitful line of work harnessed randomization and cryptography to achieve low-communication broadcast protocols with sub-quadratic total communication and with "balanced" sub-linear communication cost per party. However, comparatively little is known in the dishonest-majority setting. Here, the most communication-efficient constructions are based on the protocol of Dolev and Strong (SICOMP '83), and sub-quadratic broadcast has not been achieved even using randomization and cryptography. On the other hand, the only nontrivial ω(n) communication lower bounds are restricted to deterministic protocols, or against strong adaptive adversaries that can perform "after the fact" removal of messages. We provide communication lower bounds in this space, which hold against arbitrary cryptography and setup assumptions, as well as a simple protocol showing near tightness of our first bound. - Static adversary. We demonstrate a tradeoff between resiliency and communication for randomized protocols secure against n-o(n) static corruptions. For example, Ω(n⋅ polylog(n)) messages are needed when the number of honest parties is n/polylog(n); Ω(n√n) messages are needed for O(√n) honest parties; and Ω(n²) messages are needed for O(1) honest parties. Complementarily, we demonstrate broadcast with O(n⋅polylog(n)) total communication and balanced polylog(n) per-party cost, facing any constant fraction of static corruptions. - Weakly adaptive adversary. Our second bound considers n/2 + k corruptions and a weakly adaptive adversary that cannot remove messages "after the fact." We show that any broadcast protocol within this setting can be attacked to force an arbitrary party to send messages to k other parties. Our bound implies limitations on the feasibility of balanced low-communication protocols: For example, ruling out broadcast facing 51% corruptions, in which all non-sender parties have sublinear communication locality.more » « less
-
Attiya, Hagit (Ed.)The standard population protocol model assumes that when two agents interact, each observes the entire state of the other agent. We initiate the study of the message complexity for population protocols, where the state of an agent is divided into an externally-visible message and an internal component, where only the message can be observed by the other agent in an interaction. We consider the case of O(1) message complexity. When time is unrestricted, we obtain an exact characterization of the stably computable predicates based on the number of internal states s(n): If s(n) = o(n) then the protocol computes a semilinear predicate (unlike the original model, which can compute non-semilinear predicates with s(n) = O(log n)), and otherwise it computes a predicate decidable by a nondeterministic O(n log s(n))-space-bounded Turing machine. We then consider time complexity, introducing novel O(polylog(n)) expected time protocols for junta/leader election and general purpose broadcast correct with high probability, and approximate and exact population size counting correct with probability 1. Finally, we show that the main constraint on the power of bounded-message-size protocols is the size of the internal states: with unbounded internal states, any computable function can be computed with probability 1 in the limit by a protocol that uses only one-bit messages.more » « less
-
This paper focuses on showing time-message trade-offs in distributed algorithms for fundamental problems such as leader election, broadcast, spanning tree (ST), minimum spanning tree (MST), minimum cut, and many graph verification problems. We consider the synchronous CONGEST distributed computing model and assume that each node has initial knowledge of itself and the identifiers of its neighbors - the so-called KT_1 model - a well-studied model that also naturally arises in many applications. Recently, it has been established that one can obtain (almost) singularly optimal algorithms, i.e., algorithms that have simultaneously optimal time and message complexity (up to polylogarithmic factors), for many fundamental problems in the standard KT_0 model (where nodes have only local knowledge of themselves and not their neighbors). The situation is less clear in the KT_1 model. In this paper, we present several new distributed algorithms in the KT_1 model that trade off between time and message complexity. Our distributed algorithms are based on a uniform and general approach which involves constructing a sparsified spanning subgraph of the original graph - called a danner - that trades off the number of edges with the diameter of the sparsifier. In particular, a key ingredient of our approach is a distributed randomized algorithm that, given a graph G and any delta in [0,1], with high probability constructs a danner that has diameter O~(D + n^{1-delta}) and O~(min{m,n^{1+delta}}) edges in O~(n^{1-delta}) rounds while using O~(min{m,n^{1+delta}}) messages, where n, m, and D are the number of nodes, edges, and the diameter of G, respectively. Using our danner construction, we present a family of distributed randomized algorithms for various fundamental problems that exhibit a trade-off between message and time complexity and that improve over previous results. Specifically, we show the following results (all hold with high probability) in the KT_1 model, which subsume and improve over prior bounds in the KT_1 model (King et al., PODC 2014 and Awerbuch et al., JACM 1990) and the KT_0 model (Kutten et al., JACM 2015, Pandurangan et al., STOC 2017 and Elkin, PODC 2017): 1) Leader Election, Broadcast, and ST. These problems can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,1]. 2) MST and Connectivity. These problems can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5]. In particular, for delta = 0.5 we obtain a distributed MST algorithm that runs in optimal O~(D+sqrt{n}) rounds and uses O~(min{m,n^{3/2}}) messages. We note that this improves over the singularly optimal algorithm in the KT_0 model that uses O~(D+sqrt{n}) rounds and O~(m) messages. 3) Minimum Cut. O(log n)-approximate minimum cut can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5]. 4) Graph Verification Problems such as Bipartiteness, Spanning Subgraph etc. These can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5].more » « less
-
Distributed systems are notoriously hard to implement correctly due to non-determinism. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. We present DistAI, a data-driven automated system for learning inductive invariants for distributed protocols. DistAI generates data by simulating the distributed protocol at different instance sizes and recording states as samples. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive. Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. Because DistAI starts with the strongest possible invariants, if the SMT solver fails, DistAI does not need to discard failed invariants, but knows to monotonically weaken them and try again with the solver, repeating the process until it eventually succeeds. We prove that DistAI is guaranteed to find the∃-free inductive invariant that proves the desired safety properties in finite time, if one exists. Our evaluation shows that DistAI successfully verifies 13 common distributed protocols automatically and outperforms alternative methods both in the number of protocols it verifies and the speed at which it does so, in some cases by more than two orders of magnitude.more » « less