skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.
Attention:The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 7:00 AM ET to 7:30 AM ET on Friday, April 24 due to maintenance. We apologize for the inconvenience.


Title: LiDO-DAG: A Framework for Verifying Safety and Liveness of DAG-Based Consensus Protocols
Blockchains operating at the global scale demand high-performance byzantine fault-tolerant (BFT) consensus protocols. Most classic PBFT-like protocols suffer from an issue known as the leader bottleneck, which severely limits their throughput and resource utilization. Recently, Directed Acyclic Graph, or DAG-based protocols, have emerged as a promising approach for eliminating the leader bottleneck and achieving better performance. They attain higher throughput by separating data dissemination and block ordering. However, their safety and liveness logic is also significantly more elaborate. So far, most DAG-based protocols have only enjoyed on-paper security proofs, and it is not clear how to construct formal proofs of these protocols efficiently. We introduce LiDO-DAG, a concurrent object model that abstracts the common logic of these protocols. LiDO-DAG is constructed by combining a DAG abstraction and LiDO, a recently proposed abstraction for leader-based consensus. To demonstrate that our framework enables rapid validation of new DAG-based protocol designs, we implemented LiDO-DAG in Coq and applied it to three recent DAG-based protocols, including Narwhal, Bullshark, and Sailfish. Our framework readily yields mechanized safety and liveness proofs for all three protocols, which are also the first mechanized liveness proofs of any DAG-based protocol. Our framework has also revealed an optimization for Sailfish that improves its worst-case latency.  more » « less
Award ID(s):
2313433 2019285
PAR ID:
10638046
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
9
Issue:
PLDI
ISSN:
2475-1421
Page Range / eLocation ID:
1392 to 1416
Subject(s) / Keyword(s):
Protocol testing and verification formal software verification distributed algorithms DAG-based consensus, safety, liveness, verification, Coq proof assistant
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Ensuring order-fairness in distributed data management systems deployed in untrustworthy environments is crucial to prevent adversarial manipulation of transaction ordering, particularly in unpredictable markets where transaction order directly influences financial outcomes. While Byzantine Fault-Tolerant (BFT) consensus protocols guarantee safety and liveness, they inherently lack mechanisms to enforce order-fairness, exposing distributed systems to attacks such as frontrunning and sandwiching. Previous attempts to integrate order-fairness have often introduced substantial performance overhead, largely due to limitations of the underlying consensus protocols. This paper presents DAG of DAGs (DoD), a high-performance order-fairness protocol designed on top of DAG-based BFT consensus protocols. By leveraging the high throughput and resilience of DAG-based protocols, DoD addresses the performance limitations of existing order-fairness solutions. DoD's novel DAG of DAGs architecture enables seamless integration of order fairness with BFT consensus protocols. Through concurrent block proposals and a wave-based leader election mechanism, DoD significantly improves resilience against adversarial manipulation. A prototype implementation and experimental evaluation demonstrate that DoD effectively provides order fairness with minimal performance overhead. 
    more » « less
  2. Byzantine fault-tolerant state machine replication (SMR) protocols, such as PBFT, HotStuff, and Jolteon, are essential for modern blockchain technologies. However, they are challenging to implement correctly because they have to deal with any unexpected message from byzantine peers and ensure safety and liveness at all times. Many formal frameworks have been developed to verify the safety of SMR implementations, but there is still a gap in the verification of their liveness. Existing liveness proofs are either limited to the network level or do not cover popular partially synchronous protocols. We introduce LiDO, a consensus model that enables the verification of both safety and liveness of implementations through refinement. We observe that current consensus models cannot handle liveness because they do not include a pacemaker state. We show that by adding a pacemaker state to the LiDO model, we can express the liveness properties of SMR protocols as a few safety properties that can be easily verified by refinement proofs. Based on our LiDO model, we provide mechanized safety and liveness proofs for both unpipelined and pipelined Jolteon in Coq. This is the first mechanized liveness proof for a byzantine consensus protocol with non-trivial optimizations such as pipelining. 
    more » « less
  3. Achieving consensus is a challenging and ubiquitous problem in distributed systems that is only made harder by the introduction of malicious byzantine servers. While significant effort has been devoted to the benign and byzantine failure models individually, no prior work has considered the mechanized verification of both in a generic way. We claim this is due to the lack of an appropriate abstraction that is capable of representing both benign and byzantine consensus without either losing too much detail or becoming impractically complex. We build on recent work on the atomic distributed object model to fill this void with a novel abstraction called AdoB. In addition to revealing important insights into the essence of consensus, this abstraction has practical benefits for easing distributed system verification. As a case study, we proved safety and liveness properties for AdoB in Coq, which are the first such mechanized proofs to handle benign and byzantine consensus in a unified manner. We also demonstrate that AdoB faithfully models real consensus protocols by proving it is refined by standard network-level specifications of Fast Paxos and a variant of Jolteon. 
    more » « less
  4. Mechanized verification of liveness properties for infinite programs with effects and nondeterminism is challenging. Existing temporal reasoning frameworks operate at the level of models such as traces and automata. Reasoning happens at a very low-level, requiring complex nested (co-)inductive proof techniques and familiarity with proof assistant mechanics (e.g., the guardedness checker). Further, reasoning at the level of models instead of program constructs creates a verification gap that loses the benefits of modularity and composition enjoyed by structural program logics such as Hoare Logic. To address this verification gap, and the lack of compositional proof techniques for temporal specifications, we propose Ticl, a new structural temporal logic. Using Ticl, we encode complex (co-)inductive proof techniques as structural lemmas and focus our reasoning on variants and invariants. We show that it is possible to perform compositional proofs of general temporal properties in a proof assistant, while working at a high level of abstraction. We demonstrate the benefits of Ticl by giving mechanized proofs of safety and liveness properties for programs with scheduling, concurrent shared memory, and distributed consensus, demonstrating a low proof-to-code ratio. 
    more » « less
  5. Consensus algorithms play a central role in many distributed systems, including blockchains. The most practical consensus algorithms are based on the partial synchrony model. While partially synchronous protocols are relatively simple, they cannot maintain liveness when the message delivery latency is uncertain. Asynchronous protocols do not rely on bounded latency to maintain liveness, but they are much more difficult to understand and implement, being usually described as a composition of several layers of algorithms. Moreover, due to the FLP impossibility theorem, they only provide probabilistic liveness guarantees. These factors make the correctness of asynchronous protocols challenging to verify, and there have been liveness bugs in these protocols that remain unnoticed for years. We introduce SureDistrib, a formal framework for specifying and verifying probabilistic safety and liveness properties of asynchronous distributed protocols. Our framework supports specifying probabilistic algorithms that depend on other probabilistic functionalities, such as binary agreement algorithms depending on common coins. We define refinement relations for such systems, and prove composition lemmas that replace the underlay functionality with an implementation, so that the composed system refines the original system with an abstract underlay. Based on our framework, we give the first mechanized proof that an asynchronous byzantine fault-tolerant binary agreement algorithm terminates with probability 1 (“almost-sure termination”). 
    more » « less