One-sided communication is a useful paradigm for irregular parallel applications, but most one-sided programming environments, including MPI’s one-sided interface and PGAS programming languages, lack application-level libraries to support these applications. We present the Berkeley Container Library, a set of generic, cross-platform, high-performance data structures for irregular applications, including queues, hash tables, Bloom filters and more. BCL is written in C++ using an internal DSL called the BCL Core that provides one-sided communication primitives such as remote get and remote put operations. The BCL Core has backends for MPI, OpenSHMEM, GASNet-EX, and UPC++, allowing BCL data structures to be used natively in programs written using any of these programming environments. Along with our internal DSL, we present the BCL ObjectContainer abstraction, which allows BCL data structures to transparently serialize complex data types while maintaining efficiency for primitive types. We also introduce the set of BCL data structures and evaluate their performance across a number of high-performance computing systems, demonstrating that BCL programs are competitive with hand-optimized code, even while hiding many of the underlying details of message aggregation, serialization, and synchronization.
Obladi: Oblivious Serializable Transactions in the Cloud
This paper presents the design and implementation of Obladi, the first system to provide ACID transactions while also hiding access patterns. Obladi uses as its building block oblivious RAM, but turns the demands of supporting transactions into a performance opportunity. By executing transactions within epochs and delaying commit decisions until an epoch ends, Obladi reduces the amortized bandwidth costs of oblivious storage and increases overall system throughput. These performance gains, combined with new oblivious mechanisms for concurrency control and recovery, allow Obladi to execute OLTP workloads with reasonable throughput: it comes within 5× to 12× of a non-oblivious baseline on the TPC-C, SmallBank, and FreeHealth applications. Latency overheads, however, are higher (70× on TPC-C).
- Journal Name: Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18)
- Sponsoring Org: National Science Foundation
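The Obladi abstract describes executing transactions within epochs and deferring work until the epoch ends so that storage costs are amortized. The following is a minimal sketch of that batching idea only, not Obladi's implementation: the `EpochBatcher` class, its method names, and the plain dictionary standing in for oblivious storage are all hypothetical, and nothing here hides access patterns or performs concurrency control.

```python
from collections import defaultdict


class EpochBatcher:
    """Buffers reads during one epoch and fetches each distinct key once.

    Hypothetical illustration: `store` is an ordinary dict standing in for
    an oblivious store, so this shows amortization, not obliviousness.
    """

    def __init__(self, store):
        self.store = store
        self.pending = defaultdict(list)  # key -> ids of transactions waiting on it
        self.storage_accesses = 0         # number of accesses made to the store

    def read(self, txn_id, key):
        # Record the request; no storage access happens until the epoch ends.
        self.pending[key].append(txn_id)

    def end_epoch(self):
        # One storage access per distinct key, shared by every waiting transaction.
        results = {}
        for key, txn_ids in self.pending.items():
            value = self.store.get(key)
            self.storage_accesses += 1
            for txn_id in txn_ids:
                results[(txn_id, key)] = value
        self.pending.clear()
        return results
```

In this sketch, two transactions that read the same key within an epoch cost a single storage access, which is the sense in which per-transaction cost is amortized.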
More Like this
To achieve good performance, modern applications often partition and replicate their state across multiple geographically distributed nodes. While this approach reduces latency in the common case, it can be challenging for programmers to use correctly, especially in applications that require strong consistency. We show how to achieve strong consistency while avoiding coordination by using predictive treaties, a mechanism that can significantly reduce distributed coordination without losing strong consistency. The central insight behind our approach is that many computations can be expressed in terms of predicates over distributed state that can be partitioned and enforced locally. Predictive treaties improve on previous work by allowing the locally enforced predicates to depend on time. Intuitively, by predicting the evolution of system state, coordination can be significantly reduced compared to static approaches. We implemented predictive treaties in a distributed system that exposes them via an intuitive programming model. We evaluate performance on several benchmarks, including TPC-C, showing that predictive treaties can increase performance by orders of magnitude and can even outperform customized algorithms.
Adapting for the COVID-19 pandemic in Ecuador, a characterization of hospital strategies and patients
Calderaro, Adriana (Ed.)
The World Health Organization (WHO) declared coronavirus disease-2019 (COVID-19) a global pandemic on 11 March 2020. In Ecuador, the first case of COVID-19 was recorded on 29 February 2020. Despite efforts to control its spread, SARS-CoV-2 overran the Ecuadorian public health system, which became one of the most affected in Latin America on 24 April 2020. The Hospital General del Sur de Quito (HGSQ) had to transition from a general to a specific COVID-19 health center in a short period of time to fulfill the health demand from patients with respiratory afflictions. Here, we summarize the implementations applied in the HGSQ to become a COVID-19 exclusive hospital, including the rearrangement of hospital rooms and a triage strategy based on a severity score calculated through an artificial intelligence (AI)-assisted chest computed tomography (CT). Moreover, we present clinical, epidemiological, and laboratory data from 75 laboratory-tested COVID-19 patients, which represent the first outbreak of Quito city. The majority of patients were male with a median age of 50 years. We found differences in laboratory parameters between intensive care unit (ICU) and non-ICU cases considering C-reactive protein, lactate dehydrogenase, and lymphocytes. Sensitivity and specificity of the AI-assisted chest CT were 21.4% and 66.7%, …
Non-Volatile Memory technologies are advancing rapidly and may augment or replace DRAM in future systems. However, a key question is how programmers will use them to construct and manipulate persistent data. One possible approach gives programmers direct access to persistent memory using relocatable persistent pools that hold persistent objects, which can be accessed using persistent pointers called ObjectIDs. Prior work has shown that hardware-supported address translation for ObjectIDs provides significant performance improvement and simplifies programming; however, these works did not consider the large overheads incurred to check permissions before accessing persistent objects. In this paper, we identify permission checking in hardware as a critical mechanism that must be included when translating ObjectIDs to addresses in order to simplify programming and fully benefit from hardware translation. To support it, we add a System Persistent Object Table (SPOT) to support translation and permissions checks on ObjectIDs. The SPOT holds all known pools, their physical address, and their permissions information in memory. When a program attempts to access a persistent object, the SPOT is consulted and permissions are verified without trapping to the operating system. We have implemented our new design in a cycle-accurate simulator and compared it with software-only approaches …
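The lookup described in the abstract above, where a table of known pools is consulted to translate an ObjectID and verify permissions in one step, can be sketched in software as follows. This is an illustrative model under stated assumptions, not the paper's hardware design: the `(pool_id, offset)` form of an ObjectID, the `READ`/`WRITE` permission bits, the per-pool `size` field, and all class and method names are hypothetical.

```python
from dataclasses import dataclass

READ, WRITE = 1, 2  # hypothetical permission bits


@dataclass(frozen=True)
class PoolEntry:
    base_address: int  # where the pool starts in memory
    size: int          # pool length in bytes (assumed field, for bounds checks)
    permissions: int   # bitwise OR of READ / WRITE


class PersistentObjectTable:
    """Software model of a table mapping pool ids to address and permissions."""

    def __init__(self):
        self.pools = {}

    def register_pool(self, pool_id, base_address, size, permissions):
        self.pools[pool_id] = PoolEntry(base_address, size, permissions)

    def translate(self, pool_id, offset, access):
        # Translation and the permission check happen in the same lookup.
        entry = self.pools.get(pool_id)
        if entry is None:
            raise KeyError(f"unknown pool {pool_id}")
        if not 0 <= offset < entry.size:
            raise ValueError("offset outside pool")
        if entry.permissions & access != access:
            raise PermissionError("access not permitted")
        return entry.base_address + offset
```

For example, a pool registered read-only at base address `0x1000` translates offset 16 to `0x1010` for a read, while a write to the same object raises `PermissionError`.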
Resource disaggregation is a new architecture for data centers in which resources like memory and storage are decoupled from the CPU, managed independently, and connected through a high-speed network. Recent work has shown that although disaggregated data centers (DDCs) provide operational benefits, applications running on DDCs experience degraded performance due to extra network latency between the CPU and their working sets in main memory. DBMSs are an interesting case study for DDCs for two main reasons: (1) DBMSs normally process data-intensive workloads and require data movement between different resource components; and (2) disaggregation drastically changes the assumption that DBMSs can rely on their own internal resource management. We take the first step to thoroughly evaluate the query execution performance of production DBMSs in disaggregated data centers. We evaluate two popular open-source DBMSs (MonetDB and PostgreSQL) and test their performance with the TPC-H benchmark in a recently released operating system for resource disaggregation. We evaluate these DBMSs with various configurations and compare their performance with that of single-machine Linux with the same hardware resources. Our results confirm that significant performance degradation does occur, but, perhaps surprisingly, we also find settings in which the degradation is minor or where DDCs actually improve …