Title: SNOW Revisited: Understanding When Ideal READ Transactions Are Possible
Abstract: READ transactions that read data distributed across servers dominate the workloads of real-world distributed storage systems. The SNOW Theorem [13] stated that ideal READ transactions that have optimal latency and the strongest guarantees (i.e., “SNOW” READ transactions) are impossible in one specific setting that requires three or more clients: at least two readers and one writer. However, it left many open questions. We close all of these open questions with new impossibility results and new algorithms. First, we rigorously prove the result from [13] that it is impossible to have a READ transaction system that satisfies the SNOW properties with three or more clients. The insight gained from this proof led us to tease out the implicit assumptions required to state the results and to resolve the open question of whether SNOW is possible with two clients. We show that SNOW is possible in a multi-writer, single-reader (MWSR) setting when a client can send messages to other clients; on the other hand, we prove that it is impossible to implement SNOW in the MWSR setting, which is more general than the two-client setting, when client-to-client communication is disallowed. We also correct the claim in [13] that incorrectly identified one existing system, Eiger [12], as supporting the strongest guarantees (SW) and having read-only transactions with bounded latency. Thus, there were no previous algorithms that provided the strongest guarantees and had bounded latency. Finally, we introduce the first two algorithms to provide the strongest guarantees with bounded latency.
Award ID(s):
2003830
NSF-PAR ID:
10249610
Journal Name:
35th IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Shared register emulations on top of message-passing systems provide the illusion of a simpler shared memory system, which can make the task of a system designer easier. Numerous shared register applications have a considerably high read-to-write ratio. Thus, having algorithms that make reads more efficient than writes is a fair trade-off. Typically, such algorithms for reads and writes are asymmetric and sacrifice the stringent consistency condition of atomicity, as it is impossible to have fast reads for multi-writer atomicity. Safety is a consistency condition that has gathered interest from both the systems and theory communities, as it is weaker than atomicity yet provides strong enough guarantees like “strong consistency” or read-my-write consistency. One requirement assumed by many researchers is the reliable broadcast (RB) primitive, which ensures the all-or-none property during a broadcast. One drawback is that such a primitive takes 1.5 rounds to complete. This paper implements an efficient multi-writer multi-reader safe register without using a reliable broadcast primitive. Moreover, we provide fast reads, or one-shot reads: our read operation can be completed in one round of client-to-server communication. Of course, this comes at the price of requiring more servers than prior solutions that assume reliable broadcast. However, we show that this increased number of servers is indeed necessary, as we prove a tight bound on the number of servers required to implement Byzantine-fault-tolerant safe registers in a system without reliable broadcast. We extend our results to data stored using erasure coding as well. We present an emulation of a single-writer multi-reader safe register based on MDS codes. The use of MDS codes reduces storage cost and communication cost. On the negative side, we also show that using MDS codes and achieving one-shot reads at the same time requires even more servers.
  2. Read-only transactions are critical for consistently reading data spread across a distributed storage system but have worse performance than simple, non-transactional reads. We identify three properties of simple reads that are necessary for read-only transactions to be performance-optimal, i.e., to come as close as possible to simple reads. We demonstrate a fundamental tradeoff in the design of read-only transactions by proving that performance optimality is impossible to achieve with strict serializability, the strongest consistency. Guided by this result, we present PORT, a performance-optimal design with the strongest consistency to date. Central to PORT are version clocks, a specialized logical clock that concisely captures the necessary ordering constraints. We show the generality of PORT with two applications. Scylla-PORT provides process-ordered serializability with simple writes and shows performance comparable to its non-transactional base system. Eiger-PORT provides causal consistency with write transactions and significantly improves the performance of its transactional base system.
  3. The Bitcoin network has offered a new way of securely performing financial transactions over an insecure network. Nevertheless, this ability comes at the cost of storing a large (distributed) ledger, which has become unsuitable for personal devices of any kind. Although simplified payment verification (SPV) clients can address this storage issue, a Bitcoin SPV client has to rely on other Bitcoin nodes to obtain its transaction history, and the current approaches offer no privacy guarantees to SPV clients. This work presents T³, a trusted hardware-secured Bitcoin full client that supports efficient oblivious search/update for Bitcoin SPV clients without sacrificing the privacy of the clients. In this design, we leverage the trusted execution and attestation capabilities of a trusted execution environment (TEE) and the ability of oblivious random access machine (ORAM) to hide access patterns in order to protect SPV clients’ requests from potentially malicious nodes. The key novelty of T³ lies in the optimizations introduced to conventional ORAM, tailored for expected SPV client usage. In particular, by making a natural assumption about the access patterns of SPV clients, we are able to propose a two-tree ORAM construction that overcomes the concurrency limitation associated with traditional ORAMs. We have implemented and tested our system using the current Bitcoin Unspent Transaction Output (UTXO) set. Our experiments show that T³ is feasible to deploy in practice while providing strong privacy and security guarantees to Bitcoin SPV clients.
  4. We study exploration in stochastic multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls. We focus in particular on the allocation of distributed computing resources, where we may obtain results faster by allocating more resources per pull, but might have reduced throughput due to nonlinear scaling. For example, in simulation-based scientific studies, an expensive simulation can be sped up by running it on multiple cores. This speed-up, however, is partly offset by the communication among cores, which results in lower throughput than if fewer cores were allocated to run more trials in parallel. In this paper, we explore these trade-offs in two settings. First, in a fixed confidence setting, we need to find the best arm with a given target success probability as quickly as possible. We propose an algorithm that trades off information accumulation against throughput and show that the time taken can be upper bounded by the solution of a dynamic program whose inputs are the gaps between the sub-optimal and optimal arms. We also prove a matching hardness result. Second, we present an algorithm for a fixed deadline setting, where we are given a time deadline and need to maximize the probability of finding the best arm. We corroborate our theoretical insights with simulation experiments that show that the algorithms consistently match or outperform baseline algorithms on a variety of problem instances.
  5. A central theme in federated learning (FL) is the fact that client data distributions are often not independent and identically distributed (IID), which has strong implications on the training process. While most existing FL algorithms focus on the conventional non-IID setting of class imbalance or missing classes across clients, in practice, the distribution differences could be more complex, e.g., changes in class conditional (domain) distributions. In this paper, we consider this complex case in FL wherein each client has access to only one domain distribution. For tasks such as domain generalization, most existing learning algorithms require access to data from multiple clients (i.e., from multiple domains) during training, which is prohibitive in FL. To address this challenge, we propose a federated domain translation method that generates pseudodata for each client which could be useful for multiple downstream learning tasks. We empirically demonstrate that our translation model is more resource-efficient (in terms of both communication and computation) and easier to train in an FL setting than standard domain translation methods. Furthermore, we demonstrate that the learned translation model enables use of state-of-the-art domain generalization methods in a federated setting, which enhances accuracy and robustness to increases in the synchronization period compared to existing methodology. 