skip to main content

Search for: All records

Award ID contains: 1730628

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. ACM luminaries describe how their experiences with DEI issues vary between the different continents where they have lived.
    Free, publicly-accessible full text available December 1, 2023
  2. Digibox is a prototyping environment for IoT applications. It enables a novel scene-centric prototyping where developers can program an ensemble of simulated devices to capture not only their individual but also their coordinated behaviors, making it possible to test, debug, and evaluate the behaviors of an IoT application. Using Digibox, developers can download and reuse existing scenes, customize, and repurpose them towards developing new applications; or replicate others' experiment results from scientific research. Digibox's Kubernetes-based runtime further allows developers to easily scale the prototyping environment from a single laptop to a cluster running simulated devices and scenes at a scale appropriate to the application.
    Free, publicly-accessible full text available November 14, 2023
  3. Many systems today distribute trust across multiple parties such that the system provides certain security properties if a subset of the parties are honest. In the past few years, we have seen an explosion of academic and industrial cryptographic systems built on distributed trust, including secure multi-party computation applications (e.g., private analytics, secure learning, and private key recovery) and blockchains. These systems have great potential for improving security and privacy, but face a significant hurdle on the path to deployment. We initiate study of the following problem: a single organization is, by definition, a single party, and so how can a single organization build a distributed-trust system where corruptions are independent? We instead consider an alternative formulation of the problem: rather than ensuring that a distributed-trust system is set up correctly by design, what if instead, users can audit a distributed-trust deployment? We propose a framework that enables a developer to efficiently and cheaply set up any distributed-trust system in a publicly auditable way. To do this, we identify two application-independent building blocks that we can use to bootstrap arbitrary distributed-trust applications: secure hardware and an append-only log. We show how to leverage existing implementations of these building blocks tomore »deploy distributed-trust systems, and we give recommendations for infrastructure changes that would make it easier to deploy distributed-trust systems in the future.« less
    Free, publicly-accessible full text available November 14, 2023
  4. As distributed applications become increasingly complex, so do their scheduling requirements. This development calls for cluster schedulers that are not only general, but also evolvable. Unfortunately, most existing cluster schedulers are not evolvable: when confronted with new requirements, they need major rewrites to support these requirements. Examples include gang-scheduling support in Kubernetes [6, 39] or task-affinity in Spark [39]. Some cluster schedulers [14, 30] expose physical resources to applications to address this. While these approaches are evolvable, they push the burden of implementing scheduling mechanisms in addition to the policies entirely to the application. ESCHER is a cluster scheduler design that achieves both evolvability and application-level simplicity. ESCHER uses an abstraction exposed by several recent frameworks (which we call ephemeral resources) that lets the application express scheduling constraints as resource requirements. These requirements are then satisfied by a simple mechanism matching resource demands to available resources. We implement ESCHER on Kubernetes and Ray, and show that this abstraction can be used to express common policies offered by monolithic schedulers while allowing applications to easily create new custom policies hitherto unsupported.
    Free, publicly-accessible full text available November 7, 2023
  5. Conflict-free replicated data types (CRDTs) are a promising tool for designing scalable, coordination-free distributed systems. However, constructing correct CRDTs is difficult, posing a challenge for even seasoned developers. As a result, CRDT development is still largely the domain of academics, with new designs often awaiting peer review and a manual proof of correctness. In this paper, we present Katara, a program synthesis-based system that takes sequential data type implementations and automatically synthesizes verified CRDT designs from them. Key to this process is a new formal definition of CRDT correctness that combines a reference sequential type with a lightweight ordering constraint that resolves conflicts between non-commutative operations. Our process follows the tradition of work in verified lifting, including an encoding of correctness into SMT logic using synthesized inductive invariants and hand-crafted grammars for the CRDT state and runtime. Katara is able to automatically synthesize CRDTs for a wide variety of scenarios, from reproducing classic CRDTs to synthesizing novel designs based on specifications in existing literature. Crucially, our synthesized CRDTs are fully, automatically verified, eliminating entire classes of common errors and reducing the process of producing a new CRDT from a painstaking paper proof of correctness to a lightweight specification.
    Free, publicly-accessible full text available October 31, 2023
  6. Free, publicly-accessible full text available October 23, 2023
  7. Free, publicly-accessible full text available July 1, 2023
  8. Query optimizers are a performance-critical component in every database system. Due to their complexity, optimizers take experts months to write and years to refine. In this work, we demonstrate for the first time that learning to optimize queries without learning from an expert optimizer is both possible and efficient. We present Balsa, a query optimizer built by deep reinforcement learning. Balsa first learns basic knowledge from a simple, environment-agnostic simulator, followed by safe learning in real execution. On the Join Order Benchmark, Balsa matches the performance of two expert query optimizers, both open-source and commercial, with two hours of learning, and outperforms them by up to 2.8× in workload runtime after a few more hours. Balsa thus opens the possibility of automatically learning to optimize in future compute environments where expert-designed optimizers do not exist.
    Free, publicly-accessible full text available June 10, 2023
  9. The last decade has seen an explosion in the number of new secure multi-party computation (MPC) protocols that enable collaborative computation on sensitive data. No single MPC protocol is optimal for all types of computation. As a result, researchers have created hybrid-protocol compilers that translate a program into a hybrid protocol that mixes different MPC protocols. Hybrid-protocol compilers crucially rely on accurate cost models, which are handwritten by the compilers' developers, to choose the correct schedule of protocols. In this paper, we propose CostCO, the first automatic MPC cost modeling framework. CostCO develops a novel API to interface with a variety of MPC protocols, and leverages domain-specific properties of MPC in order to enable efficient and automatic cost-model generation for a wide range of MPC protocols. CostCO employs a two-phase experiment design to efficiently synthesize cost models of the MPC protocol's runtime as well as its memory and network usage. We verify CostCO's modeling accuracy for several full circuits, characterize the engineering effort required to port existing MPC protocols, and demonstrate how hybrid-protocol compilers can leverage CostCO's cost models.
    Free, publicly-accessible full text available June 1, 2023
  10. Applications today rely on cloud databases for storing and querying time-series data. While outsourcing storage is convenient, this data is often sensitive, making data breaches a serious concern. We present Waldo, a time-series database with rich functionality and strong security guarantees: Waldo supports multi-predicate filtering, protects data contents as well as query filter values and search access patterns, and provides malicious security in the 3-party honest-majority setting. In contrast, prior systems such as Timecrypt and Zeph have limited functionality and security: (1) these systems can only filter on time, and (2) they reveal the queried time interval to the server. Oblivious RAM (ORAM) and generic multiparty computation (MPC) are natural choices for eliminating leakage from prior work, but both of these are prohibitively expensive in our setting due to the number of roundtrips and bandwidth overhead, respectively. To minimize both, Waldo builds on top of function secret sharing, enabling Waldo to evaluate predicates non-interactively. We develop new techniques for applying function secret sharing to the encrypted database setting where there are malicious servers, secret inputs, and chained predicates. With 32-core machines, Waldo runs a query with 8 range predicates over 2 18 records in 3.03s, compared to 12.88s or anmore »MPC baseline and 16.56s for an ORAM baseline. Compared to Waldo, the MPC baseline uses 9−82× more bandwidth between servers (for different numbers of records), while the ORAM baseline uses 20−152× more bandwidth between the client and server(s) (for different numbers of predicates).« less
    Free, publicly-accessible full text available May 1, 2023