The performance of partially synchronous BFT-based consensus protocols is highly dependent on the primary node. All participant nodes in the network are blocked until they receive a proposal from the primary node to begin the consensus process. Therefore, an honest but slack node (one with limited bandwidth) can adversely affect performance when selected as primary. Hermes decreases protocol dependency on the primary node and minimizes the transmission delay induced by a slack primary while keeping message complexity and latency low and scalability high. Hermes achieves these performance improvements by relaxing strong BFT agreement (safety) guarantees only for a specific type of Byzantine fault (also called equivocation faults). Interestingly, we show that in Hermes equivocation by a Byzantine primary is expensive and ineffective. Therefore, the safety of Hermes is comparable to that of general BFT consensus. We deployed and tested Hermes on 190 Amazon EC2 instances. In these tests, Hermes's performance was comparable to the state-of-the-art BFT protocol for blockchains (when the network size is large) in the absence of slack nodes. In the presence of slack nodes, however, Hermes significantly outperforms the state-of-the-art BFT protocol in terms of throughput and latency.
Vote Them Out: Detecting and Eliminating Byzantine Peers
Byzantine Fault Tolerant (BFT) protocols are designed to ensure correctness and eventual progress in the face of misbehaving nodes [1]. However, this does not prevent the negative effects an adversary may have on performance: a faulty node may significantly affect the latency and throughput of the system without being detected. This is especially true in speculative protocols optimized for the best case, where a single leader can force the protocol into the worst case [3]. Systems like Aardvark [2] that are designed to maximize worst-case performance tolerate Byzantine behavior without necessarily detecting who the perpetrator is. By forcing regular view changes, for example, they mitigate the effects of leaders who deliberately delay dissemination of messages, even if this behavior would be difficult to prove to a third party. Byzantine faults, by definition, can be difficult to detect. An error of 'commission', such as a message with a mismatching digest, can be proven. Errors of 'omission', such as delaying or failing to relay a message, as a rule cannot be proven, and the node responsible for these types of omission faults may not appear faulty to all observers. Nevertheless, we observe that they can reliably be detected. Designing protocols that detect and eject nodes is challenging for two reasons. First, some behaviors are observed by only a subset of honest nodes and cannot be objectively proven to a third party. Second, any mechanism capable of ejecting nodes could be subverted by Byzantine nodes to eject honest nodes. This paper presents the Protocol for Ejecting All Corrupted Hosts (Peach), a mechanism for detecting and ejecting faulty nodes in Byzantine fault tolerant (BFT) protocols. Nodes submit votes to a trusted configuration manager that replaces faulty nodes once a threshold number of votes is received.
We implement Peach for two BFT protocol variants, a traditional PBFT-style three-phase protocol and a speculative protocol, and evaluate its ability to respond to Byzantine behavior. This work makes the following contributions: (1) We present and prove a necessary and sufficient constraint on cluster membership guaranteeing that any nodes causing performance degradation via acts of omission will be detected. (2) We present an agreement protocol, PEACHes, in which replicas pass votes about their subjective local observations of possible omissions to a trusted third party (TTP). (3) We show how the separation of detection and effectuation allows fine-grained detection of malicious behavior that is compatible and easily integrated with existing systems. (4) We present DecentBFT, an extension of BFT-SMaRt to which we added a speculative fast path (similar to Zyzzyva) and integrated PEACHes. (5) We show that DecentBFT rapidly detects and mitigates a variety of performance attacks that would have gone undetected by the state of the art.
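The vote-and-eject idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ConfigManager`, its methods, and in particular the ejection threshold of f + 1 distinct accusers are assumptions made for illustration (the abstract says only that "a threshold" of votes is required). With f + 1 distinct voters, at least one accuser must be honest, so f Byzantine nodes cannot eject an honest node on their own.

```python
from collections import defaultdict


class ConfigManager:
    """Toy trusted configuration manager that tallies accusation votes.

    Hypothetical sketch: the names and the f + 1 threshold are assumptions,
    not taken from the paper.
    """

    def __init__(self, members, f):
        self.members = set(members)
        self.threshold = f + 1            # assumed: f + 1 distinct accusers
        self.accusers = defaultdict(set)  # suspect -> set of distinct voters

    def vote(self, voter, suspect):
        """Record one vote; return True if this vote ejects the suspect."""
        if voter not in self.members or suspect not in self.members:
            return False                  # ignore votes from or about non-members
        if voter == suspect:
            return False                  # self-accusations carry no information
        self.accusers[suspect].add(voter)  # a set ignores repeated votes
        if len(self.accusers[suspect]) >= self.threshold:
            self.members.discard(suspect)
            del self.accusers[suspect]
            return True
        return False
```

For example, with four members and f = 1, a single node voting repeatedly against `"d"` has no effect, while a second distinct voter triggers the ejection. A fuller design would also need to decide what to do with votes previously cast by an ejected node; this sketch leaves them in place.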
- Award ID(s): 1750060
- PAR ID: 10249862
- Date Published:
- Journal Name: SoCC '19: Proceedings of the ACM Symposium on Cloud Computing
- Page Range / eLocation ID: 480 to 480
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Byzantine Fault Tolerant (BFT) protocols serve as a fundamental yet intricate component of distributed data management systems in untrustworthy environments. BFT protocols exhibit different design principles and performance characteristics under varying workloads and fault scenarios. The proliferation of BFT protocols and their growing complexity have made it increasingly challenging to analyze the performance and possible application scenarios of each protocol. This demonstration showcases BFTGym, an interactive platform that allows audience members to (1) evaluate, compare, and gather insights into the performance of various BFT protocols under a wide range of conditions, and (2) prototype new BFT protocols rapidly.
Bessani, Alysson; Défago, Xavier; Nakamura, Junya; Wada, Koichi; Yamauchi, Yukiko (Ed.) This paper studies the design of Byzantine consensus algorithms in an asynchronous single-hop network equipped with the "abstract MAC layer" [DISC09], which captures core properties of modern wireless MAC protocols. Newport [PODC14], Newport and Robinson [DISC18], and Tseng and Zhang [PODC22] study crash-tolerant consensus in the model. In our setting, a Byzantine faulty node may behave arbitrarily, but it cannot break the guarantees provided by the underlying abstract MAC layer. To our knowledge, we are the first to study Byzantine faults in this model. We harness the power of the abstract MAC layer to develop a Byzantine approximate consensus algorithm and a Byzantine randomized binary consensus algorithm. Both of our algorithms require only knowledge of the upper bound f on the number of faulty nodes, and do not require knowledge of the number of nodes n. This demonstrates the "power" of the abstract MAC layer, as consensus algorithms in traditional message-passing models require knowledge of both n and f. Additionally, we show that it is necessary to know f in order to reach consensus. Hence, from this perspective, our algorithms require minimal knowledge. The lack of knowledge of n brings the challenge of identifying a quorum explicitly, which is a common technique in traditional message-passing algorithms. A key technical novelty of our algorithms is to identify "implicit quorums", which have the necessary information for reaching consensus. The quorums are implicit because nodes do not know the identity of the quorums; the notion is used only in the analysis.
Many aspects of blockchain-based decentralized finance can be understood as an extension of classical distributed computing. In this paper, we trace the evolution of two interrelated notions: failure and fault-tolerance. In classical distributed computing, a failure to complete a multi-party protocol is typically attributed to hardware malfunctions. A fault-tolerant protocol is one that responds to such failures by rolling the system back to an earlier consistent state. In the presence of Byzantine failures, a failure may be the result of an attack, and a fault-tolerant protocol is one that ensures that attackers will be punished and victims compensated. In modern decentralized finance, however, failure to complete a protocol can be considered a legitimate option, not a transgression. A fault-tolerant protocol is one that ensures that the party offering the option cannot renege, and the party purchasing the option provides fair compensation (in the form of a fee) to the offering party. We sketch the evolution of such protocols, starting with two-phase commit, and finishing with timed hashlocked smart contracts.
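As a rough illustration of the "timed hashlocked" contracts this abstract ends with, the sketch below models a single escrow that can be claimed with a hash preimage before a deadline or refunded afterwards. It is a simplified model, not code from the paper: the class name `HashedTimelock`, the use of SHA-256, and the integer clock passed in as `now` are all assumptions made for illustration.

```python
import hashlib


class HashedTimelock:
    """Toy hashed-timelock escrow (hypothetical names; SHA-256 is assumed)."""

    def __init__(self, hashlock: bytes, deadline: int):
        self.hashlock = hashlock  # SHA-256 digest of a secret preimage
        self.deadline = deadline  # first time at which a refund is allowed
        self.state = "locked"

    def claim(self, preimage: bytes, now: int) -> bool:
        """Counterparty claims the funds by revealing the preimage in time."""
        matches = hashlib.sha256(preimage).digest() == self.hashlock
        if self.state == "locked" and now < self.deadline and matches:
            self.state = "claimed"
            return True
        return False

    def refund(self, now: int) -> bool:
        """Original owner recovers the funds once the deadline has passed."""
        if self.state == "locked" and now >= self.deadline:
            self.state = "refunded"
            return True
        return False
```

The deadline is what turns non-completion into an option rather than a fault: if the preimage is never revealed, the escrow simply reverts to its owner, and neither party can take both outcomes.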
In a key-agreement protocol whose security is proven in the random oracle model (ROM), the parties and the eavesdropper can make a bounded number of queries to a shared random function (an "oracle"). Such protocols are the alternative to key-agreement protocols whose security is based on "public-key assumptions", assumptions that, being more structured, are presumably more vulnerable to attacks. Barak and Mahmoody [Crypto '09] (following Impagliazzo and Rudich [STOC '89]) have shown that ROM key-agreement protocols can only guarantee limited secrecy: the key of any l-query protocol can be revealed by an O(l^2)-query adversary, a bound that matches the gap obtained by the Merkle's Puzzles two-message protocol of Merkle [CACM '78]. While this quadratic gap might not seem like much, if the honest parties are willing to work "hard enough", and given the continuous improvement in the evaluation time of common hash functions, this gap yields a good enough advantage (assuming the security of the protocol holds when instantiating the random function with a fixed hash function). In this work we consider the communication complexity of ROM key-agreement protocols. In Merkle's Puzzles, the honest parties need to exchange Ω(l) bits (ignoring logarithmic factors) to obtain secrecy against an eavesdropper that makes roughly l^2 queries, which makes the protocol unrealizable in many settings. We show that for protocols with certain natural properties, such high communication is unavoidable. Specifically, this is the case if the honest parties' queries are independent and uniformly random, or alternatively if the protocol uses non-adaptive queries and has only two rounds. Since two-round key-agreement protocols are equivalent to public-key encryption schemes (seeing the first message as the public key), the latter result bounds the public-key and encryption size of public-key encryption schemes whose security is proven in the ROM.
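The quadratic gap and the Ω(l) communication cost mentioned above can be made concrete with a toy, oracle-based variant of Merkle's Puzzles. This is a sketch under stated assumptions, not the construction analyzed in the paper: SHA-256 stands in for the random oracle, the domain size is fixed at l^2, and the function names (`oracle`, `merkle_key_agreement`, `eavesdrop`) are hypothetical.

```python
import hashlib
import random


def oracle(x: int) -> str:
    """Stand-in for the random oracle (assumption: SHA-256 of the decimal string)."""
    return hashlib.sha256(str(x).encode()).hexdigest()


def merkle_key_agreement(l: int, rng: random.Random):
    """Return (alice_key, bob_key, announced_tag, bob_query_count)."""
    domain = l * l
    # Alice queries l distinct random points and publishes only the oracle
    # outputs: this first message is where the roughly l-sized communication arises.
    alice_points = {oracle(x): x for x in rng.sample(range(domain), l)}
    # Bob queries random points until one of his outputs matches Alice's list.
    bob_queries = 0
    while True:
        y = rng.randrange(domain)
        bob_queries += 1
        tag = oracle(y)
        if tag in alice_points:
            break
    # Bob announces the matching tag; Alice looks up her preimage for it.
    return alice_points[tag], y, tag, bob_queries


def eavesdrop(tag: str, l: int):
    """Eavesdropper's brute force over the whole domain: (key, query_count)."""
    for queries, x in enumerate(range(l * l), start=1):
        if oracle(x) == tag:
            return x, queries
    raise ValueError("tag has no preimage in the domain")
```

Each of Bob's queries hits Alice's set with probability l / l^2 = 1/l, so he expects to make about l queries, whereas the eavesdropper sees only the tag and must search a domain of size l^2. Running `merkle_key_agreement(50, random.Random(0))` and then `eavesdrop` on the announced tag shows both sides agreeing on a key that the eavesdropper recovers only after scanning part of the 2500-point domain.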