skip to main content


Title: Verifying concurrent, crash-safe systems with Perennial
This paper introduces Perennial, a framework for verifying concurrent, crash-safe systems. Perennial extends the Iris concurrency framework with three techniques to enable crash-safety reasoning: recovery leases, recovery helping, and versioned memory. To ease development and deployment of applications, Perennial provides Goose, a subset of Go and a translator from that subset to a model in Perennial with support for reasoning about Go threads, data structures, and file-system primitives. We implemented and verified a crash-safe, concurrent mail server using Perennial and Goose that achieves speedup on multiple cores. Both Perennial and Iris use the Coq proof assistant, and the mail server and the framework's proofs are machine checked.  more » « less
Award ID(s):
1836712 1563763
NSF-PAR ID:
10115003
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. GoJournal’s goal is to bring the advantages of journaling for code to specs and proofs. Perennial 2.0 makes this possible by introducing several techniques to formalize GoJournal’s specification and to manage the complexity in the proof of GoJournal’s implementation. Lifting predicates and crash framing make the specification easy to use for developers, and logically atomic crash specifications allow for modular reasoning in GoJournal, making the proof tractable despite complex concurrency and crash interleavings. GoJournal is implemented in Go, and Perennial is implemented in the Coq proof assistant. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. We built a functional NFSv3 server, called GoNFS, to use GoJournal. Performance experiments show that GoNFS provides similar performance (e.g., at least 90% throughput across several benchmarks on an NVMe disk) to Linux’s NFS server exporting an ext4 file system, suggesting that GoJournal is a competitive journaling system. We also verified a simple NFS server using GoJournal’s specs, which confirms that they are helpful for application verification: a significant part of the proof doesn’t have to consider concurrency and crashes. 
    more » « less
  2. This paper develops a new approach to verifying a performant file system that isolates crash safety and concurrency reasoning to a transaction system that gives atomic access to the disk, so that the rest of the file system can be verified with sequential reasoning. We demonstrate this approach in DaisyNFS, a Network File System (NFS) server written in Go that runs on top of a disk. DaisyNFS uses GoTxn, a new verified, concurrent transaction system that extends GoJournal with two-phase locking and an allocator. The transaction system's specification formalizes under what conditions transactions can be verified with only sequential reasoning, and comes with a mechanized proof in Coq that connects the specification to the implementation. As evidence that proofs enjoy sequential reasoning, DaisyNFS uses Dafny, a sequential verification language, to implement and verify all the NFS operations on top of GoTxn. The sequential proofs helped achieve a number of good properties in DaisyNFS: easy incremental development (for example, adding support for large files), a relatively short proof (only 2x as many lines of proof as code), and a performant implementation (at least 60% the throughput of the Linux NFS server exporting ext4 across a variety of benchmarks). 
    more » « less
  3. Reasoning about storage systems is challenging because these systems make persistence guarantees even if the system crashes at any point. To achieve these crash-safety guarantees, storage systems include recovery procedures to restore the system to a consistent state after a crash. Moreover, large-scale systems are structured as multiple stacked layers and can require recovery at multiple layers of abstraction. Formal verification can ensure that crash-safety guarantees hold regardless of when the system crashes. To make verification tractable, large-scale systems should be verified in a modular fashion, layer-by-layer in the software stack. Layered recovery makes modularity challenging because the system can crash in the middle of a high-level recovery procedure and must start over from the low-level recovery procedure. We present Argosy, a framework for machine-checked proofs of storage systems that supports layered recovery implementations with modular proofs. The framework is based on combinators for transition relations that are inspired by Kleene algebra, which provides a convenient formalism for specifying and reasoning about crashes and recovery. On top of this framework, we implement Crash Hoare Logic (CHL), the program logic used by FSCQ. Using the logic, we have verified an example of layered recovery featuring a write-ahead log on top of a disk, which itself runs by replicating over two unreliable disks. The metatheory of the framework, the soundness of the program logic, and these examples are all verified in the Coq theorem prover. 
    more » « less
  4. Go is a young programming language invented to build safe and efficient concurrent programs. It provides goroutines as lightweight threads and channels for inter-goroutine communication. Programmers are encouraged to explicitly pass messages through channels to connect goroutines, with the purpose of reducing the chance of making programming mistakes and introducing concurrency bugs. Go is one of the most beloved programming languages and has already been used to build many critical infrastructure software systems in the data-center environment. However, a recent study shows that channel-related concurrency bugs are still common in Go programs, severely hurting the reliability of the programs. This paper presents GFuzz, a dynamic detector that can effectively pinpoint channel-related concurrency bugs by mutating the processing orders of concurrent messages. We build GFuzz in three steps. We first adopt an effective approach to identify concurrent messages and transform a program to process those messages in any given order. We then take a fuzzing approach to generate new processing orders by mutating exercised ones and rely on execution feedback to prioritize orders close to triggering bugs. Finally, we design a runtime sanitizer to capture triggered bugs that are missed by the Go runtime. We evaluate GFuzz on seven popular Go software systems, including Docker, Kubernetes, and gRPC. GFuzz finds 184 previously unknown bugs and reports a negligible number of false positives. Programmers have already confirmed 124 reports as real bugs and fixed 67 of them based on our reporting. A careful inspection of the detected concurrency bugs from gRPC shows the effectiveness of each component of GFuzz and confirms the components' rationality. 
    more » « less
  5. The universal composability (UC) framework is the established standard for analyzing cryptographic protocols in a modular way, such that security is preserved under concurrent composition with arbitrary other protocols. However, although UC is widely used for on-paper proofs, prior attempts at systemizing it have fallen short, either by using a symbolic model (thereby ruling out computational reduction proofs), or by limiting its expressiveness. In this paper, we lay the groundwork for building a concrete, executable implementation of the UC framework. Our main contribution is a process calculus, dubbed the Interactive Lambda Calculus (ILC). ILC faithfully captures the computational model underlying UC—interactive Turing machines (ITMs)—by adapting ITMs to a subset of the π-calculus through an affine typing discipline. In other words, well-typed ILC programs are expressible as ITMs. In turn, ILC’s strong confluence property enables reasoning about cryptographic security reductions. We use ILC to develop a simplified implementation of UC called SaUCy. 
    more » « less