skip to main content


Title: Fast Nonblocking Persistence for Concurrent Data Structures
We present a fully lock-free variant of our recent Montage system for persistent data structures. The variant, nbMontage, adds persistence to almost any nonblocking concurrent structure without introducing significant overhead or blocking of any kind. Like its predecessor, nbMontage is buffered durably linearizable: it guarantees that the state recovered in the wake of a crash will represent a consistent prefix of pre-crash execution. Unlike its predecessor, nbMontage ensures wait-free progress of the persistence frontier, thereby bounding the number of recent updates that may be lost on a crash, and allowing a thread to force an update of the frontier (i.e., to perform a sync operation) without the risk of blocking. As an extra benefit, the helping mechanism employed by our wait-free sync significantly reduces its latency. Performance results for nonblocking queues, skip lists, trees, and hash tables rival custom data structures in the literature – dramatically faster than achieved with prior general-purpose systems, and generally within 50% of equivalent non-persistent structures placed in DRAM.  more » « less
Award ID(s):
1717712 1900803 1955498
NSF-PAR ID:
10294988
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
35th Intl. Symp. on Distributed Computing (DISC)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The emergence of fast, dense, nonvolatile main memory suggests that certain long-lived data might remain in their natural pointerrich format across program runs and hardware reboots. Operations on such data must currently be instrumented with explicit writeback and fence instructions to ensure consistency in the wake of a crash. Techniques to minimize the cost of this instrumentation are an active topic of research. We present what we believe to be the first general-purpose approach to building buffered persistent data structures, and a system, Montage, to support that approach. Montage is built on top of the Ralloc nonblocking persistent allocator. It employs a millisecondgranularity epoch clock, and ensures that no operation appears to span an epoch boundary. It also arranges to persist only that data minimally required to reconstruct the structure after a crash. If a crash occurs in epoch e, all work performed in epochs e and e − 1 is lost, but work from prior epochs is preserved, consistently. As in traditional file and database systems, a sync operation can be used to flush buffers on demand; the Montage sync is extremely fast. We describe the implementation of Montage, argue its correctness, and report unprecedented throughput for persistent queues, sets/mappings, and general graphs. 
    more » « less
  2. null (Ed.)
    The recent emergence of fast, dense, nonvolatile main memory suggests that certain long-lived data structures might remain in their natural, pointer-rich format across program runs and hardware reboots. Operations on such structures must be instrumented with explicit write-back and fence instructions to ensure consistency in the wake of a crash. Techniques to minimize the cost of this instrumentation are an active topic of current research. We present what we believe to be the first general-purpose approach to building buffered durably linearizable persistent data structures, and a system, Montage, to support that approach. Montage is built on top of the Ralloc nonblocking persistent allocator. It employs a slow-ticking epoch clock, and ensures that no operation appears to span an epoch boundary. If a crash occurs in epoch e, all work performed in epochs e and e-1 is lost, but all work from prior epochs is preserved. We describe the implementation of Montage, argue its correctness, and report on experiments confirming excellent performance for operations on queues, sets/mappings, and general graphs. 
    more » « less
  3. We introduce nonblocking transaction composition (NBTC), a new methodology for atomic composition of nonblocking operations on concurrent data structures. Unlike previous software transactional memory (STM) approaches, NBTC leverages the linearizability of existing nonblocking structures, reducing the number of memory accesses that must be executed together, atomically, to only one per operation in most cases (these are typically the linearizing instructions of the constituent operations). Our obstruction-free implementation of NBTC, which we call Medley, makes it easy to transform most nonblocking data structures into transactional counterparts while preserving their nonblocking liveness and high concurrency. In our experiments, Medley outperforms Lock-Free Transactional Transform (LFTT), the fastest prior competing methodology, by 40--170%. The marginal overhead of Medley's transactional composition, relative to separate operations performed in succession, is roughly 2.2×. For persistent memory, we observe that failure atomicity for transactions can be achieved "almost for free" with epoch-based periodic persistence. Toward that end, we integrate Medley with nbMontage, a general system for periodically persistent data structures. The resulting txMontage provides ACID transactions and achieves throughput up to two orders of magnitude higher than that of the OneFile persistent STM system. 
    more » « less
  4. This paper introduces nonblocking transaction composition (NBTC), a new methodology for atomic composition of nonblocking operations on concurrent data structures. Unlike previous software transactional memory (STM) approaches, NBTC leverages the linearizability of existing nonblocking structures, reducing the number of memory accesses that must be executed together, atomically, to only one per operation in most cases (these are typically the linearizing instructions of the constituent operations). Our obstruction-free implementation of NBTC, which we call Medley, makes it easy to transform most nonblocking data structures into transactional counterparts while preserving their liveness and high concurrency. In our experiments, Medley outperforms Lock-Free Transactional Transform (LFTT), the fastest prior competing methodology, by 40--170%. The marginal overhead of Medley's transactional composition, relative to separate operations performed in succession, is roughly 2.2x. For persistent data structures, we observe that failure atomicity for transactions can be achieved "almost for free'' with epoch-based periodic persistence. Toward that end, we integrate Medley with nbMontage, a general system for periodically persistent data structures. The resulting txMontage provides ACID transactions and achieves throughput up to two orders of magnitude higher than that of the OneFile persistent STM system. 
    more » « less
  5. null (Ed.)
    Newly emerging nonvolatile alternatives to DRAM raise the possibility that applications might compute directly on long-lived data, rather than serializing them to and from a file system or database. To ensure crash consistency, such data must, like a file system or database, provide failure-atomic transactional semantics. Several persistent software transactional memory (STM) systems have been devised to provide these semantics, but only one—the OneFile system of Ramalhete et al.—is nonblocking. Nonblocking progress is desirable to avoid both performance anomalies due to process preemption or failures and deadlock due to priority inversion. Unfortunately, OneFile achieves nonblocking progress at the cost of 2× space overhead, sacrificing much of the cost and density benefit of nonvolatile memory relative to DRAM. OneFile also requires extensive and intrusive changes to data declarations, and works only on a machine with double-width compare-and-swap (CAS) or load-linked/store-conditional (LL/SC) instructions. To address these limitations, we introduce QSTM, a nonblocking persistent STM that requires neither the modification of target data structures nor the availability of a wide CAS instruction. We describe our system, give arguments for safety and liveness, and compare performance to that of the Mnemosyne and OneFile persistent STM systems. We argue that modest performance costs (within a factor of 2 of OneFile in almost all cases) are easily justified by dramatically lower space overhead and higher programmer convenience. 
    more » « less