Check-Wait-Pounce: Increasing Transactional Data Structure Throughput by Delaying Transactions
Transactional data structures allow data structures to support transactional execution, in which a sequence of operations appears to execute atomically. We consider a paradigm in which a transaction commits its changes to the data structure only if all of its operations succeed; if one operation fails, then the transaction aborts. In this work, we introduce an optimization technique called Check-Wait-Pounce that increases performance by avoiding aborts that occur due to failed operations. Check-Wait-Pounce improves upon existing methodologies by delaying the execution of transactions until they are expected to succeed, using a thread-unsafe representation of the data structure as a heuristic. Our evaluation reveals that Check-Wait-Pounce reduces the number of aborts by an average of 49.0%. Because of this reduction in aborts, the tested transactional linked lists achieve average gains in throughput of 2.5x, while some achieve gains as high as 4x.
- PAR ID: 10105856
- Date Published:
- Journal Name: IFIP International Conference on Distributed Applications and Interoperable Systems
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
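To make the delaying heuristic from the abstract above concrete, here is a minimal sketch (not the paper's implementation; the `TxSet` transactional set and its operation format are hypothetical): before a transaction runs, its operations are checked against a thread-unsafe shadow copy of the structure, and the transaction waits and re-checks rather than starting while it is expected to fail.

```python
import time

class TxAbort(Exception):
    """One operation failed, so the whole transaction aborts."""

class TxSet:
    """Toy transactional set: insert fails if the key is present, remove if absent."""
    def __init__(self):
        self.items = set()

    def execute(self, ops):
        staged = set(self.items)                 # all-or-nothing semantics
        for op, key in ops:
            if op == "insert":
                if key in staged:
                    raise TxAbort(f"insert({key})")
                staged.add(key)
            else:                                # "remove"
                if key not in staged:
                    raise TxAbort(f"remove({key})")
                staged.discard(key)
        self.items = staged                      # commit

def expected_to_succeed(shadow, ops):
    """Check: simulate the operations on a thread-unsafe shadow copy."""
    view = set(shadow)
    for op, key in ops:
        if op == "insert":
            if key in view:
                return False
            view.add(key)
        else:
            if key not in view:
                return False
            view.discard(key)
    return True

def check_wait_pounce(txset, shadow, ops, backoff=0.001, max_checks=100):
    """Wait (delay) while the heuristic predicts an abort, then pounce."""
    for _ in range(max_checks):
        if expected_to_succeed(shadow, ops):
            break
        time.sleep(backoff)                      # another thread may change the structure
    txset.execute(ops)                           # may still abort: the shadow is only a heuristic
    for op, key in ops:                          # keep the shadow loosely in sync
        shadow.add(key) if op == "insert" else shadow.discard(key)
```

Because the shadow is only a heuristic, the real execution can still abort; the point is that most transactions that would have failed never start, which is where the reduction in aborts comes from.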
More Like this
This paper introduces Mako, a highly available, high-throughput, and horizontally scalable transactional key-value store. Mako performs strongly consistent geo-replication to maintain availability despite entire datacenter failures, uses multi-core machines for fast serializable transaction processing, and shards data to scale out. To achieve these properties, and especially to overcome the overheads of distributed transactions in geo-replicated settings, Mako decouples transaction execution from replication. This enables Mako to run transactions speculatively and very fast, and to replicate them in the background to make them fault-tolerant. The key innovation in Mako is the speculative use of two-phase commit (2PC), which allows distributed transactions to proceed without waiting for their decisions to be replicated while still preventing unbounded cascading aborts if shards fail before replication completes. Our experimental evaluation on Azure shows that Mako processes 3.66M TPC-C transactions per second when data is split across 10 shards, each running with 24 threads. This is 8.6× higher throughput than state-of-the-art systems optimized for geo-replication.
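To make the decoupling concrete, here is a minimal single-process sketch (not Mako's implementation) in which 2PC decisions are taken speculatively while replication runs on a background thread; the `Shard` class, its `prepare`/`commit` methods, and the `two_phase_commit` coordinator are hypothetical stand-ins.

```python
import queue
import threading

class Shard:
    """Toy shard: a speculative 2PC participant that replicates in the background."""
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.repl_log = queue.Queue()      # replication is off the critical path
        threading.Thread(target=self._replicate_async, daemon=True).start()

    def prepare(self, txid, writes):
        # Speculative prepare: validate locally and vote immediately,
        # without waiting for the prepare record to be replicated.
        return True                        # stand-in for real validation/locking

    def commit(self, txid, writes):
        self.store.update(writes)          # apply speculatively, visible right away
        self.repl_log.put((txid, writes))  # durability comes later, in the background

    def _replicate_async(self):
        while True:
            txid, writes = self.repl_log.get()
            # ship (txid, writes) to replicas here; the transaction is only
            # fault-tolerant once this completes
            self.repl_log.task_done()

def two_phase_commit(txid, writes_per_shard):
    """Coordinator: classic 2PC, except decisions do not wait for replication."""
    if all(shard.prepare(txid, w) for shard, w in writes_per_shard.items()):
        for shard, w in writes_per_shard.items():
            shard.commit(txid, w)
        return "speculatively committed"   # final once replication catches up
    return "aborted"
```

The hard part the abstract points to, bounding cascading aborts when a shard fails before its speculative decisions have been replicated, is not modeled in this sketch.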
Transactional memory has been receiving much attention from both academia and industry. In transactional memory, program code is split into transactions, blocks of code that appear to execute atomically. Transactions are executed speculatively, and the speculative execution is supported through a data versioning mechanism. Lazy versioning makes aborts fast but penalizes commits, whereas eager versioning makes commits fast but penalizes aborts. However, whether to use eager or lazy versioning to execute those transactions is still a hotly debated topic. Lazy versioning seems appropriate for write-dominated workloads and transactions in high-contention scenarios, whereas eager versioning seems appropriate for read-dominated workloads and transactions in low-contention scenarios. This necessitates a priori knowledge of the workload and contention scenario in order to select an appropriate versioning method. In this article, we present an adaptive versioning approach, called Adaptive, that dynamically switches between eager and lazy versioning at runtime based on appropriate system parameters, without requiring a priori knowledge of the workload and contention scenario, so that the performance of a transactional memory system is always better than that obtained using either eager or lazy versioning individually. We provide Adaptive for both persistent and non-persistent transactional memory systems, using performance parameters appropriate for those systems. We implemented our adaptive versioning approach in the latest distribution of the software transactional memory system TinySTM and evaluated it extensively through 5 micro-benchmarks and 8 complex benchmarks from the STAMP and STAMPEDE suites. The results show significant benefits of our approach: in persistent TM systems, it improved execution time by as much as 1.5× and reduced the number of aborts by as much as 240×, while in non-persistent transactional memory systems it improved execution time by as much as 6.3× and reduced the number of aborts by as much as 170×.
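As an illustration of what switching versioning methods at runtime might look like, here is a toy sketch (not the TinySTM implementation; the abort-rate threshold, class name, and transaction interface are made up) that picks eager versioning when aborts are rare and lazy versioning when they are frequent.

```python
class AdaptiveWriteSet:
    """Toy illustration of switching between eager and lazy versioning
    based on the recently observed abort rate (the threshold is made up)."""
    ABORT_RATE_THRESHOLD = 0.3

    def __init__(self, store):
        self.store = store                          # shared key-value "memory"
        self.commits = self.aborts = 0

    def _use_eager(self):
        total = self.commits + self.aborts
        rate = self.aborts / total if total else 0.0
        return rate < self.ABORT_RATE_THRESHOLD     # few aborts -> eager pays off

    def run(self, tx_writes, conflict=False):
        """Run one write-only toy transaction; `conflict` simulates failed validation."""
        if self._use_eager():
            undo = {k: self.store.get(k) for k in tx_writes}   # undo log
            self.store.update(tx_writes)            # eager: write in place
            if conflict:
                self.store.update(undo)             # abort is slow: roll back
                self.aborts += 1
                return "aborted (eager)"
            self.commits += 1
            return "committed (eager: nothing left to do)"
        else:
            buffered = dict(tx_writes)              # lazy: buffer the writes
            if conflict:
                self.aborts += 1                    # abort is fast: drop the buffer
                return "aborted (lazy)"
            self.store.update(buffered)             # commit is slow: apply the buffer
            self.commits += 1
            return "committed (lazy)"
```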
The Transactional Data Structure Library (TDSL) methodology improves the programmability and performance of concurrent software by making it possible for programmers to compose multiple concurrent data structure operations into coarse-grained transactions. Like transactional memory, TDSL enables arbitrarily many operations on arbitrarily many data structures to appear to other threads as a single atomic, isolated transaction. Like concurrent data structures, the individual operations on a TDSL data structure are optimized to avoid artificial contention. We introduce techniques for reducing false conflicts in TDSL implementations. Our approach allows expressing the postconditions of operations entirely via semantic properties, instead of through low-level structural properties. Our design is general enough to support lists, deques, ordered and unordered maps, and vectors. It supports richer programming interfaces than are available in existing TDSL implementations. It is also capable of precise memory management, which is necessary in low-level languages like C++.
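A small sketch of the semantic-conflict idea (not the TDSL library's API; the `SemanticTxSet` class and its transaction descriptor are hypothetical): commit-time validation re-checks only the membership answers the transaction observed, so unrelated structural changes to the data structure do not abort it.

```python
class SemanticTxSet:
    """Toy transactional set whose commit-time validation checks semantic
    postconditions (the membership answers a transaction observed) rather
    than which nodes it happened to traverse."""
    def __init__(self):
        self.items = set()

    def begin(self):
        return {"reads": {}, "adds": set(), "removes": set()}

    def contains(self, tx, key):
        answer = key in self.items
        tx["reads"][key] = answer            # record the semantic fact observed
        return answer

    def insert(self, tx, key):
        tx["adds"].add(key)
        tx["removes"].discard(key)

    def remove(self, tx, key):
        tx["removes"].add(key)
        tx["adds"].discard(key)

    def commit(self, tx):
        # Semantic validation: only the keys this transaction asked about must
        # still return the same answer. A real implementation would validate
        # and apply atomically, e.g. under a lock.
        for key, answer in tx["reads"].items():
            if (key in self.items) != answer:
                return False                 # genuine semantic conflict -> abort
        self.items |= tx["adds"]
        self.items -= tx["removes"]
        return True
```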
CockroachDB is an open-source database providing transactional access to data in a distributed setting. CockroachDB employs a multi-version timestamp ordering protocol to provide serializability. This is a simple mechanism for enforcing serializability, but its static timestamp allocation scheme can lead to a high number of aborts under contention. We aim to reduce aborts for transactional workloads by integrating a dynamic timestamp ordering concurrency control scheme into CockroachDB. Dynamic timestamp ordering reduces the number of aborts by allocating timestamps dynamically, based on the conflicts observed on the accessed data items. This gives a transaction a higher chance of fitting onto a logically serializable timeline, especially in workloads with high contention.
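The contrast between static and dynamic timestamp allocation can be sketched as follows (a heavily simplified single-version toy, not CockroachDB's MVCC protocol; it ignores read-time checks and timestamp refreshes, and all names are made up):

```python
class Item:
    """Per-item metadata tracked by timestamp-ordering concurrency control."""
    def __init__(self):
        self.read_ts = 0    # largest timestamp that has read this item
        self.write_ts = 0   # largest timestamp that has written this item

def static_commit(start_ts, read_set, write_set):
    """Static allocation: the timestamp was fixed when the transaction started,
    so any item already read or written at a higher timestamp forces an abort."""
    if any(start_ts < it.write_ts for it in read_set):
        return None                                   # abort
    if any(start_ts < max(it.read_ts, it.write_ts) for it in write_set):
        return None                                   # abort
    return _finish(start_ts, read_set, write_set)

def dynamic_commit(read_set, write_set):
    """Dynamic allocation: choose the commit timestamp only now, just above the
    conflicts actually observed, so the transaction can often still be placed
    on a serializable timeline instead of aborting."""
    ts = 1 + max([it.write_ts for it in read_set] +
                 [max(it.read_ts, it.write_ts) for it in write_set] + [0])
    return _finish(ts, read_set, write_set)

def _finish(ts, read_set, write_set):
    for it in read_set:
        it.read_ts = max(it.read_ts, ts)
    for it in write_set:
        it.write_ts = max(it.write_ts, ts)
    return ts
```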