skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 5:00 PM ET until 11:00 PM ET on Friday, June 21 due to maintenance. We apologize for the inconvenience.

Title: TransKV: A Networking Support for Transaction Processing in Distributed Key-value Stores
Through the massive use of mobile devices, data clouds, and the rise of Internet of Things, enormous amount of data has been generated and analyzed for the benefit of society. NoSQL Databases and specially key-value stores be­ come the backbone in managing these large amounts of data. Most of key-value stores ignore transactions due to their ef­fect on degrading key-value store's performance. Meanwhile, programmable switches with the software-defined networks and the Programming Protocol-Independent Packet Processor (P4) lead to a programmable network where in-network computa­ tion can help accelerating the performance of applications. In this paper, we proposed a networking support for transaction processing in distributed key-value stores. Our system leverages the programmable switch to act as a transaction coordinator. Using a variation of the time stamp ordering concurrency control approach, the programmable switch can decide to proceed in transaction processing or abort the transaction directly from the network. Our experimental results on an initial prototype show that our proposed approach, while supporting transactions, improves the throughput by up to 4X and reduces the latency by 35% when compared to the existing architectures.  more » « less
Award ID(s):
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE International Conference on Big Data Computing Service and Applications (BigDataService)
Page Range / eLocation ID:
41 to 49
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Distributed key-value stores today require frequent key-value shard migration between nodes to react to dynamic workload changes for load balancing, data locality, and service elasticity. In this paper, we propose NetMigrate, a live migration approach for in-memory key-value stores based on programmable network data planes. NetMigrate migrates shards between nodes with zero service interruption and minimal performance impact. During migration, the switch data plane monitors the migration process in a fine-grained manner and directs client queries to the right server in real time, eliminating the overhead of pulling data between nodes. We implement a NetMigrate prototype on a testbed consisting of a programmable switch and several commodity servers running Redis and evaluate it under YCSB workloads. Our experiments demonstrate that NetMigrate improves the query throughput from 6.5% to 416% and maintains low access latency during migration, compared to the state-of-the-art migration approaches. 
    more » « less
  2. null (Ed.)
    Bitcoin and other altcoin cryptocurrencies use the Elliptic-Curve cryptography to control the ownership of coins. A user has one or more private keys to sign a transaction and send coins to others. The user locks her private keys with a password and stores them on a piece of software or a hardware wallet to protect them. A challenge in cryptocurrencies is losing access to private keys by its user, resulting in inaccessible coins. These coins are assigned to addresses which access to their private keys is impossible. Today, about 20 percent of all possible bitcoins are inaccessible and lost forever. A promising solution is the off-chain recovery transaction that aggregates all available coins to send them to an address when the private key is not accessible. Unfortunately, this recovery transaction must be regenerated after all sends and receives, and it is time-consuming to generate on hardware wallets. In this paper, we propose a new mechanism called lean recovery transaction to tackle this problem. We make a change in wallet key management to generate the recovery transaction as less frequently as possible. In our design, the wallet generates a lean recovery transaction only when needed and provides better performance, especially for micropayment. We evaluate the regular recovery transaction on two real hardware wallets and implement our proposed mechanism on a hardware wallet. We achieve a %40 percentage of less processing time for generating payment transactions with few numbers of inputs. The performance difference becomes even more significant, with a larger number of inputs. 
    more » « less
  3. Programmable networks support a wide variety of applications, including access control, routing, monitoring, caching, and synchronization. As demand for applications grows, so does resource contention within the switch data plane. Cramming applications onto a switch is a challenging task that often results in non-modular programming, frustrating “trial and error” compile-debug cycles, and suboptimal use of resources. In this paper, we present P4All, an extension of P4 that allows programmers to define elastic data structures that stretch automatically to make optimal use of available switch resources. These data structures are defined using symbolic primitives (that parameterize the size and shape of the structure) and objective functions (that quantify the value gained or lost as that shape changes). A top-level optimization function specifies how to share resources amongst data structures or applications. We demonstrate the inherent modularity and effectiveness of our design by building a range of reusable elastic data structures including hash tables, Bloom filters, sketches, and key-value stores, and using those structures within larger applications. We show how to implement the P4All compiler using a combination of dependency analysis, loop unrolling, linear and non-linear constraint generation, and constraint solving. We evaluate the compiler’s performance, showing that a range of elastic programs can be compiled to P4 in few minutes at most, but usually less. 
    more » « less
  4. null (Ed.)
    High performance distributed storage systems face the challenge of load imbalance caused by skewed and dynamic workloads. This paper introduces Pegasus, a new storage system that leverages new-generation programmable switch ASICs to balance load across storage servers. Pegasus uses selective replication of the most popular objects in the data store to distribute load. Using a novel in-network coherence directory, the Pegasus switch tracks and manages the location of replicated objects. This allows it to achieve load-aware forwarding and dynamic rebalancing for replicated keys, while still guaranteeing data coherence and consistency. The Pegasus design is practical to implement as it stores only forwarding metadata in the switch data plane. The resulting system improves the throughput of a distributed in-memory key-value store by more than 10x under a latency SLO -- results which hold across a large set of workloads with varying degrees of skew, read/write ratio, object sizes, and dynamism. 
    more » « less
  5. In this paper, we present GraphTM, an efficient and scalable framework for processing transactions in a distributed environment. The distributed environment is modeled as a graph where each node of the graph is a processing node that issues transactions. The objects that transactions use to execute are also on the graph nodes (the initial placement may be arbitrary). The transactions execute on the nodes which issue them after collecting all the objects that they need following the data-flow model of computation. This collection is done by issuing the requests for the objects as soon as transaction starts and wait until all required objects for the transaction come to the requesting node. The challenge is on how to schedule the transactions so that two crucial performance metrics, namely (i) total execution time to commit all the transactions, and (ii) total communication cost involved in moving the objects to the requesting nodes, are minimized. We implemented GraphTM in Java and assessed its performance through 3 micro-benchmarks and 5 complex benchmarks from STAMP benchmark suite on 5 different network topologies, namely, clique, line, grid, cluster, and star, that make an underlying communication network for a representative set of distributed systems commonly used in practice. The results show the efficiency and scalability of our approach. 
    more » « less