Search for: All records

Award ID contains: 1704742


  1. Oblivious routing has a long history in both the theory and practice of networking. In this work we initiate the formal study of oblivious routing in the context of reconfigurable networks, a new architecture that has recently come to the fore in datacenter networking. These networks allow a rapidly changing bounded-degree pattern of interconnections between nodes, but the network topology and the selection of routing paths must both be oblivious to the traffic demand matrix. Our focus is on the trade-off between maximizing throughput and minimizing latency in these networks. For every constant throughput rate, we characterize (up to a constant factor) the minimum latency achievable by an oblivious reconfigurable network design that satisfies the given throughput guarantee. The trade-off between these two objectives turns out to be surprisingly subtle: the curve depicting it has an unexpected scalloped shape reflecting the fact that load-balancing becomes more difficult when the average length of routing paths is not an integer because equalizing all the path lengths is not possible. The proof of our lower bound uses LP duality to verify that Valiant load balancing is the most efficient oblivious routing scheme when used in combination with an optimally-designed reconfigurable network topology. The proof of our upper bound uses an algebraic construction in which the network nodes are identified with vectors over a finite field, the network topology is described by either the elementary basis or a sequence of Vandermonde matrices, and routing paths are constructed by selecting columns of these matrices to yield the appropriate mixture of path lengths within the shortest possible time interval. 
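    To make the flavor of the algebraic construction concrete, here is a small Python sketch. It is illustrative only: the field size p, dimension d, and all function names are my own choices, and the paper's actual design schedules the elementary basis and Vandermonde matrices far more carefully.

```python
import itertools

p, d = 5, 2                                # illustrative field size and dimension
nodes = list(itertools.product(range(p), repeat=d))  # nodes = vectors in GF(p)^d

def vandermonde(points, d, p):
    # Row i holds x**i mod p for each evaluation point x, so each column
    # is (1, x, x**2, ..., x**(d-1)) for one point.
    return [[pow(x, i, p) for x in points] for i in range(d)]

V = vandermonde([1, 2, 3, 4], d, p)        # a d x 4 Vandermonde matrix over GF(5)

def hop(node, col, scalar):
    # One reconfigurable-network hop: move by scalar * (column col of V), mod p.
    return tuple((node[i] + scalar * V[i][col]) % p for i in range(d))

# A routing path selects columns of V (with scalars) one hop at a time; how
# many distinct columns a route uses is what yields the mixture of path lengths.
src = (0, 0)
print(hop(hop(src, 0, 3), 1, 2))           # two hops along two different columns
```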
  2. This paper demonstrates that it is possible to achieve µs-scale latency using the Linux kernel storage stack, even when tens of latency-sensitive applications compete for host resources with throughput-bound applications that perform read/write operations at throughput close to hardware capacity. Furthermore, such performance can be achieved without any modifications to applications, network hardware, kernel CPU schedulers, or the kernel network stack. We demonstrate this using the design, implementation, and evaluation of blk-switch, a new Linux kernel storage stack architecture. The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. Evaluating blk-switch over a variety of scenarios shows that it consistently achieves µs-scale average and tail latency (at both the 99th and 99.9th percentiles) while allowing applications to near-perfectly utilize the hardware capacity.
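    As a rough illustration of the network-switch analogy, here is a toy Python model. The queue layout and scheduling policy are my simplification; the real blk-switch operates on Linux's per-core blk-mq and network queues.

```python
from collections import deque

NUM_CORES = 4
lq = [deque() for _ in range(NUM_CORES)]   # latency-critical egress queues
tq = [deque() for _ in range(NUM_CORES)]   # throughput-bound egress queues

def submit(req, latency_critical, core):
    if latency_critical:
        lq[core].append(req)               # stays on the submitting core
    else:
        # Load balancing: steer throughput-bound work to the least-loaded core.
        target = min(range(NUM_CORES), key=lambda c: len(tq[c]))
        tq[target].append(req)

def process_one(core):
    # Prioritized processing: drain latency-critical requests before any
    # throughput-bound request, as a switch would serve a high-priority queue.
    if lq[core]:
        return lq[core].popleft()
    if tq[core]:
        return tq[core].popleft()
    return None

for i in range(8):
    submit(f"T{i}", latency_critical=False, core=0)
submit("L0", latency_critical=True, core=0)
print(process_one(0))   # -> "L0": the latency-critical request jumps ahead
```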
  3. Traditional end-host network stacks are struggling to keep up with rapidly increasing datacenter access link bandwidths due to their unsustainable CPU overheads. Motivated by this, our community is exploring a multitude of solutions for future network stacks: from Linux kernel optimizations to partial hardware offload to clean-slate userspace stacks to specialized host network hardware. The design space explored by these solutions would benefit from a detailed understanding of CPU inefficiencies in existing network stacks. This paper presents measurements and insights on Linux kernel network stack performance for 100Gbps access link bandwidths. Our study reveals that such high-bandwidth links, coupled with relatively stagnant technology trends for other host resources (e.g., core speeds and counts, cache sizes, NIC buffer sizes, etc.), mark a fundamental shift in host network stack bottlenecks. For instance, we find that a single core is no longer able to process packets at line rate, with data copy from kernel to application buffers at the receiver becoming the core performance bottleneck. In addition, the increase in bandwidth-delay products has outpaced the increase in cache sizes, resulting in an inefficient DMA pipeline between the NIC and the CPU. Finally, we find that the traditional loosely coupled design of network stacks and CPU schedulers in existing operating systems becomes a limiting factor in scaling network stack performance across cores. Based on insights from our study, we discuss implications for the design of future operating systems, network protocols, and host hardware.
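    The cache-versus-bandwidth-delay-product point can be made concrete with back-of-the-envelope arithmetic; the RTT and cache budget below are my own illustrative numbers, not measurements from the paper.

```python
# Bytes in flight on a 100Gbps link vs. the slice of last-level cache that
# NIC DMA (e.g., via DDIO) can use. Illustrative numbers only.
link_gbps = 100
rtt_us = 200                                   # assumed host-to-host RTT with queueing
bdp_bytes = link_gbps * 1e9 / 8 * rtt_us * 1e-6

ddio_budget_bytes = 1.5 * 2**20                # assumed DMA-usable LLC slice

print(f"in flight:   {bdp_bytes / 2**20:.2f} MiB")          # ~2.38 MiB
print(f"DMA budget:  {ddio_budget_bytes / 2**20:.2f} MiB")  # 1.50 MiB
# Once in-flight data exceeds what the cache slice can hold, DMA'd packets
# are evicted to DRAM before the application copies them, which is the
# inefficient DMA pipeline the study describes.
```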
  4. This paper presents CodedBulk, a system for high-throughput inter-datacenter bulk transfers. At its core, CodedBulk uses network coding, a technique from the coding theory community that guarantees optimal throughput for individual bulk transfers. Prior attempts to use network coding in wired networks have faced several pragmatic and fundamental barriers. CodedBulk resolves these barriers by exploiting the unique properties of inter-datacenter networks, and by using a custom-designed hop-by-hop flow control mechanism that enables efficient realization of network coding atop existing transport protocols. An end-to-end CodedBulk implementation running on a geo-distributed inter-datacenter network improves bulk transfer throughput by 1.2−2.5× compared to state-of-the-art mechanisms that do not use network coding.
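    For intuition about why network coding improves bulk multicast throughput, here is the textbook butterfly example in Python. This is the classic coding-theory illustration, not CodedBulk's actual code construction or flow-control mechanism.

```python
# Two source blocks a and b must both reach two receivers over a network
# whose middle link fits only one block per round; forwarding a XOR b lets
# each receiver recover both blocks in a single round.
a = b"block-a!"
b = b"block-b?"

def xor(x, y):
    return bytes(p ^ q for p, q in zip(x, y))

coded = xor(a, b)               # the single shared middle link carries a ^ b

# Receiver 1 hears `a` directly plus the coded block; receiver 2 hears `b`.
recovered_b_at_r1 = xor(a, coded)
recovered_a_at_r2 = xor(b, coded)
assert recovered_b_at_r1 == b and recovered_a_at_r2 == a
print("both receivers decoded both blocks in one round")
```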
  5. Cloud applications are increasingly relying on hundreds of loosely-coupled microservices to complete user requests that meet an application's end-to-end QoS requirements. Communication time between services accounts for a large fraction of the end-to-end latency and can introduce performance unpredictability and QoS violations. This work presents our early work on Dagger, a hardware acceleration platform for networking, designed specifically with the unique qualities of microservices in mind. The Dagger architecture relies on an FPGA-based NIC, closely coupled with the processor over a configurable memory interconnect, designed to offload and accelerate RPC stacks. Unlike traditional cloud systems that use PCIe links as the NIC I/O interface, we leverage memory-interconnected FPGAs as networking devices to provide the efficiency, transparency, and programmability needed for fine-grained microservices. We show that this considerably improves CPU utilization and performance for cloud RPCs.
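    To illustrate the memory-interconnect idea, here is a toy Python model of a polled shared-memory RPC ring. This is entirely my own sketch; Dagger's real CPU/FPGA interface, coherence protocol, and RPC stack differ.

```python
RING_SIZE = 8
ring = [None] * RING_SIZE       # stands in for a cache-coherent shared region
head = tail = 0                 # producer (CPU) and consumer ("NIC") cursors

def cpu_send_rpc(payload):
    # Producer side: write the RPC into shared memory; over a coherent
    # interconnect there is no PCIe doorbell or DMA descriptor to post.
    global head
    if head - tail == RING_SIZE:
        raise RuntimeError("ring full")
    ring[head % RING_SIZE] = payload
    head += 1

def nic_poll_rpc():
    # Consumer side: the "NIC" polls the shared region for new entries.
    global tail
    if tail == head:
        return None
    payload = ring[tail % RING_SIZE]
    tail += 1
    return payload

cpu_send_rpc({"method": "get_user", "id": 42})
print(nic_poll_rpc())           # -> {'method': 'get_user', 'id': 42}
```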
  6. We present PANCAKE, the first system to protect key-value stores from access pattern leakage attacks with small constant-factor bandwidth overhead. PANCAKE uses a new approach that we call frequency smoothing to transform plaintext accesses into uniformly distributed encrypted accesses to an encrypted data store. We show that frequency smoothing prevents access pattern leakage attacks by passive persistent adversaries in a new formal security model. We integrate PANCAKE into three key-value stores used in production clusters and demonstrate its practicality: on standard benchmarks, PANCAKE achieves 229× better throughput than non-recursive Path ORAM, within 3–6× of insecure baselines for these key-value stores.
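    A minimal Python sketch of the frequency-smoothing idea follows; it is my simplification, and the real PANCAKE also issues fake queries and adapts to changing access distributions.

```python
import random

freq = {"hot": 0.6, "warm": 0.3, "cold": 0.1}   # assumed access distribution
R = 10                                          # illustrative replica budget

# Give each key a number of encrypted replicas proportional to its frequency.
replicas = {k: max(1, round(f * R)) for k, f in freq.items()}

def access(key):
    # A logical read becomes a read of one uniformly random replica; in
    # PANCAKE this index would select an encrypted label in the data store.
    return (key, random.randrange(replicas[key]))

# Every physical slot is now hit with probability f(k) / n(k):
# 0.6/6 = 0.3/3 = 0.1/1 = 0.1, i.e. uniform across all 10 replicas, so the
# observed access pattern no longer reveals which keys are popular.
print(replicas, access("hot"))
```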
  7. Cloud applications are increasingly shifting to interactive and loosely-coupled microservices. Despite their advantages, microservices complicate resource management due to inter-tier dependencies. We present Sinan and PuppetMaster, two cluster managers for interactive microservices that leverage easily obtainable tracing data, instead of empirical decisions, to infer the impact of a resource allocation on end-to-end performance and to allocate appropriate resources to each tier. In a preliminary evaluation of the system with an end-to-end social network built with microservices, we show that the cluster managers' data-driven approach allows the service to always meet its QoS without sacrificing resource efficiency.
  8. We present operational experience running Snowflake, a cloud-based data warehousing system with SQL support similar to state-of-the-art databases. Snowflake's design is motivated by three goals: (1) compute and storage elasticity; (2) support for multi-tenancy; and (3) high performance. Over the last few years, Snowflake has grown to serve thousands of customers executing millions of queries on petabytes of data every day. We discuss Snowflake's design with a particular focus on its ephemeral storage system, query scheduling, elasticity, and efficient support for multi-tenancy. Using statistics collected during the execution of 70 million queries over a 14-day period, our study highlights how recent changes in cloud infrastructure have altered many of the assumptions that guided the design and optimization of Snowflake, and outlines several interesting avenues of future research.
  9. This paper presents the design, implementation, and evaluation of i10, a new remote storage stack implemented entirely in the kernel. i10 runs on commodity hardware, allows unmodified applications to operate directly on the kernel's TCP/IP network stack, and yet saturates a 100Gbps link for remote accesses with CPU utilization similar to state-of-the-art user-space and RDMA-based solutions.