NSF PAR Search | NSF Public Access Repository

Fast and fair randomized wait-free locks

https://doi.org/10.1007/s00446-024-00474-4

Ben-David, Naama; Blelloch, Guy E (March 2025, Distributed Computing)

Abstract: We present a randomized approach for wait-free locks with strong bounds on time and fairness in a context in which any process can be arbitrarily delayed. Our approach supports a tryLock operation that is given a set of locks, and code to run when all the locks are acquired. A tryLock operation may fail if there is contention on the locks, in which case the code is not run. Given an upper bound $\kappa$ known to the algorithm on the point contention of any lock, and an upper bound $L$ on the number of locks in a tryLock’s set, a tryLock will succeed in acquiring its locks and running the code with probability at least $1/(\kappa L)$. It is thus fair. Furthermore, if the maximum step complexity for the code in any lock is $T$, the operation will take $O(\kappa^2 L^2 T)$ steps, regardless of whether it succeeds or fails. The operations are independent, thus if the tryLock is repeatedly retried on failure, it will succeed in $O(\kappa^3 L^3 T)$ expected steps. If the algorithm does not know the bounds $\kappa$ and $L$, we present a variant that can guarantee a probability of at least $1/\kappa L\log(\kappa L T)$ of success. We assume an oblivious adversarial scheduler, which does not make decisions based on the operations, but can predetermine any schedule for the processes, which is unknown to our algorithm. Furthermore, to account for applications that change their future requests based on the results of previous tryLock operations, we strengthen the adversary by allowing decisions of the start times and lock sets of tryLock operations to be made adaptively, given the history of the execution so far.
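The expected-step bound for repeated retries can be recovered from the two per-attempt bounds in the abstract by a standard geometric-distribution argument. This is only a sketch, assuming independent attempts that each succeed with probability at least $p = 1/(\kappa L)$ and each take at most $c\,\kappa^2 L^2 T$ steps, where $c$ is a hypothetical name for the constant hidden in the $O(\cdot)$:

$$\mathbb{E}[\text{attempts}] \le \frac{1}{p} = \kappa L, \qquad \mathbb{E}[\text{steps}] \le \kappa L \cdot c\,\kappa^2 L^2 T = O(\kappa^3 L^3 T).$$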
Free, publicly-accessible full text available March 1, 2026
Fast and Fair Randomized Wait-Free Locks

https://doi.org/10.1145/3519270.3538448

Ben-David, Naama; Blelloch, Guy E. (July 2022, Principles of Distributed Computing)

We present a randomized approach for wait-free locks with strong bounds on time and fairness in a context in which any process can be arbitrarily delayed. Our approach supports a tryLock operation that is given a set of locks, and code to run when all the locks are acquired. A tryLock operation, or attempt, may fail if there is contention on the locks, in which case the code is not run. Given an upper bound κ known to the algorithm on the point contention of any lock, and an upper bound L on the number of locks in a tryLock’s set, a tryLock will succeed in acquiring its locks and running the code with probability at least 1/(κL). It is thus fair. Furthermore, if the maximum step complexity for the code in any lock is T, the attempt will take O(κ^2 L^2 T) steps, regardless of whether it succeeds or fails. The attempts are independent, thus if the tryLock is repeatedly retried on failure, it will succeed in O(κ^3 L^3 T) expected steps, and with high probability in not much more.
Lock-free locks revisited

https://doi.org/10.1145/3503221.3508433

Ben-David, Naama; Blelloch, Guy E.; Wei, Yuanhao (March 2022, Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)

This paper presents a new and practical approach to lock-free locks based on helping, which allows the user to write code using fine-grained locks, but run it in a lock-free manner. Although lock-free locks have been suggested in the past, they are widely viewed as impractical, have some key limitations, and, as far as we know, have never been implemented. The paper presents some key techniques that make lock-free locks practical and more general. The most important technique is an approach to idempotence—i.e. making code that runs multiple times appear as if it ran once. The idea is based on using a shared log among processes running the same protected code. Importantly, the approach can be library based, requiring very little if any change to standard code—code just needs to use the idempotent versions of memory operations (load, store, LL/SC, allocation, free). We have implemented a C++ library called Flock based on the ideas. Flock allows lock-based data structures to run in either lock-free or blocking (traditional locks) mode. We implemented a variety of tree and list-based data structures with Flock and compare the performance of the lock-free and blocking modes under a variety of workloads. The lock-free mode is almost as fast as blocking mode under almost all workloads, and significantly faster when threads are oversubscribed (more threads than processors). We also compare with several existing lock-based and lock-free alternatives.
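The shared-log idea behind idempotence can be illustrated with a small sketch. This is not Flock's API: `SharedLog`, `IdempotentRun`, and `increment` are hypothetical names, the sketch is in Python rather than C++, a mutex stands in for an atomic compare-and-swap, and stores are plain writes of the agreed value, which is a simplification. It only shows how helpers that replay the same protected code can agree on every loaded value, so that the code appears to have run once.

```python
import threading


class SharedLog:
    """Log shared by every helper that runs the same protected code."""

    def __init__(self):
        self._entries = []
        # Assumption: a mutex stands in for an atomic compare-and-swap.
        self._guard = threading.Lock()

    def commit(self, position, value):
        """Record value at position unless already filled; return the winner."""
        with self._guard:
            if position == len(self._entries):
                self._entries.append(value)
            return self._entries[position]


class IdempotentRun:
    """One helper's execution of the protected code against the shared log."""

    def __init__(self, log, memory):
        self.log = log
        self.memory = memory
        self.position = 0  # index of this run's next load in the log

    def load(self, key):
        # The first helper to reach this position fixes the value for everyone.
        value = self.log.commit(self.position, self.memory[key])
        self.position += 1
        return value

    def store(self, key, value):
        # All helpers computed value from the same logged loads, so they
        # write the same thing (simplified: no protection against late writers).
        self.memory[key] = value


def increment(run):
    """Protected code: read a counter and write back counter + 1."""
    current = run.load("counter")
    run.store("counter", current + 1)


memory = {"counter": 0}
log = SharedLog()
increment(IdempotentRun(log, memory))  # original owner of the lock
increment(IdempotentRun(log, memory))  # a helper replaying the same code
print(memory["counter"])  # prints 1: the increment took effect once
```

Without the log, the second call would read the updated counter and increment it again. A real implementation would also have to make stores, allocation, and free idempotent under true concurrency, which this sketch does not attempt.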
FliT: a library for simple and efficient persistent algorithms

https://doi.org/10.1145/3503221.3508436

Wei, Yuanhao; Ben-David, Naama; Friedman, Michal; Blelloch, Guy E.; Petrank, Erez (March 2022, Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)

Non-volatile random access memory (NVRAM) offers byte-addressable persistence at speeds comparable to DRAM. However, with caches remaining volatile, automatic cache evictions can reorder updates to memory, potentially leaving persistent memory in an inconsistent state upon a system crash. Flush and fence instructions can be used to force ordering among updates, but are expensive. This has motivated significant work studying how to write correct and efficient persistent programs for NVRAM. In this paper, we present FliT, a C++ library that facilitates writing efficient persistent code. Using the library's default mode makes any linearizable data structure durable with minimal changes to the code. FliT avoids many redundant flush instructions by using a novel algorithm to track dirty cache lines. It also allows for extra optimizations, but achieves good performance even in its default setting. To describe the FliT library's capabilities and guarantees, we define a persistent programming interface, called the P-V Interface, which FliT implements. The P-V Interface captures the expected behavior of code in which some instructions' effects are persisted and some are not. We show that the interface captures the desired semantics of many practical algorithms in the literature. We apply the FliT library to four different persistent data structures, and show that across several workloads, persistence implementations, and data structure sizes, the FliT library always improves operation throughput, by at least 2.1X over a naive implementation in all but one workload.
Constant-time snapshots with applications to concurrent data structures

https://doi.org/10.1145/3437801.3441602

Wei, Yuanhao; Ben-David, Naama; Blelloch, Guy E.; Fatourou, Panagiota; Ruppert, Eric; Sun, Yihan (February 2021, ACM/SIGPLAN Conference on Principles and Practice of Parallel Programming)
Full Text Available
NVTraverse: in NVRAM data structures, the destination is more important than the journey

https://doi.org/10.1145/3385412.3386031

Friedman, Michal; Ben-David, Naama; Wei, Yuanhao; Blelloch, Guy E.; Petrank, Erez (June 2020, ACM Conference on Programming Language Design and Implementation (PLDI))
Full Text Available
Implicit Decomposition for Write-Efficient Connectivity Algorithms

https://doi.org/10.1109/IPDPS.2018.00081

Ben-David, Naama; Blelloch, Guy; Fineman, Jeremy; Gibbons, Phillip; Gu, Yan; McGuffey, Charles; Shun, Julian (May 2018, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS))

Full Text Available
Parallel Algorithms for Asymmetric Read-Write Costs

https://doi.org/10.1145/2935764.2935767

Ben-David, Naama; Blelloch, Guy E; Fineman, Jeremy T; Gibbons, Phillip B; Gu, Yan; McGuffey, Charles; Shun, Julian (July 2016, 28th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA))

Motivated by the significantly higher cost of writing than reading in emerging memory technologies, we consider parallel algorithm design under such asymmetric read-write costs, with the goal of reducing the number of writes while preserving work-efficiency and low span. We present a nested-parallel model of computation that combines (i) small per-task stack-allocated memories with symmetric read-write costs and (ii) an unbounded heap-allocated shared memory with asymmetric read-write costs, and show how the costs in the model map efficiently onto a more concrete machine model under a work-stealing scheduler. We use the new model to design reduced write, work-efficient, low span parallel algorithms for a number of fundamental problems such as reduce, list contraction, tree contraction, breadth-first search, ordered filter, and planar convex hull. For the latter two problems, our algorithms are output-sensitive in that the work and number of writes decrease with the output size. We also present a reduced write, low span minimum spanning tree algorithm that is nearly work-efficient (off by the inverse Ackermann function). Our algorithms reveal several interesting techniques for significantly reducing shared memory writes in parallel algorithms without asymptotically increasing the number of shared memory reads.
Full Text Available
