NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

RDMA is Turing complete, we just did not know it yet!

Reda, Waleed; Kostić, Dejan; Canini, Marco; Peter, Simon (January 2022, 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI))

It is becoming increasingly popular for distributed systems to exploit offload to reduce load on the CPU. Remote Direct Memory Access (RDMA) offload, in particular, has become popular. However, RDMA still requires CPU intervention for complex offloads that go beyond simple remote memory access. As such, the offload potential is limited and RDMA-based systems usually have to work around such limitations. We present RedN, a principled, practical approach to implementing complex RDMA offloads, without requiring any hardware modifications. Using self-modifying RDMA chains, we lift the existing RDMA verbs interface to a Turing complete set of programming abstractions. We explore what is possible in terms of offload complexity and performance with a commodity RDMA NIC. We show how to integrate these RDMA chains into applications, such as the Memcached key-value store, allowing us to offload complex tasks such as key lookups. RedN can reduce the latency of key-value get operations by up to 2.6× compared to state-of-the-art KV designs that use one-sided RDMA primitives (e.g., FaRM-KV), as well as traditional RPC-over-RDMA approaches. Moreover, compared to these baselines, RedN provides performance isolation and, in the presence of contention, can reduce latency by up to 35× while providing applications with failure resiliency to OS and process crashes.
more » « less
Full Text Available
Tardis: A Fault-Tolerant Design for Network Control Planes

https://doi.org/10.1145/3482898.3483355

Zhou, Zhenyu; Benson, Theophilus A.; Canini, Marco; Chandrasekaran, Balakrishnan (October 2021, In The ACM SIGCOMM Symposium on SDN Research)

Full Text Available
LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism

https://doi.org/10.1145/3477132.3483565

Kim, Jongyul; Jang, Insu; Reda, Waleed; Im, Jaeseong; Canini, Marco; Kostić, Dejan; Kwon, Youngjin; Peter, Simon; Witchel, Emmett (October 2021, Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles)

Full Text Available
Assise: performance and availability via client-local NVM in a distributed file system

Anderson, Thomas E; Canini, Marco; Kim, Jongyul; Kostic, Dejan; Kwon, Youngjin; Peter, Simon; Reda, Waleed; Schuh, Henry N; Witchel, Emmett (November 2020, OSDI'20: Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation)

The adoption of low latency persistent memory modules (PMMs) upends the long-established model of remote storage for distributed file systems. Instead, by colocating computation with PMM storage, we can provide applications with much higher IO performance, sub-second application failover, and strong consistency. To demonstrate this, we built the Assise distributed file system, based on a persistent, replicated coherence protocol that manages client-local PMM as a linearizable and crash-recoverable cache between applications and slower (and possibly remote) storage. Assise maximizes locality for all file IO by carrying out IO on process-local, socket-local, and client-local PMM whenever possible. Assise minimizes coherence overhead by maintaining consistency at IO operation granularity, rather than at fixed block sizes. We compare Assise to Ceph/BlueStore, NFS, and Octopus on a cluster with Intel Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as LevelDB, Postfix, and FileBench. We find that Assise improves write latency up to 22×, throughput up to 56×, fail-over time up to 103×, and scales up to 6× better than its counterparts, while providing stronger consistency semantics.
more » « less
Full Text Available
Assise: Performance and Availability via Client-local NVM in a Distributed File System

Anderson, Thomas; Canini, Marco; Kim, Jongyul; Kostic, Dejan; Kwon, Youngjin; Peter, Simon; Reda, Waleed; Schuh, Henry; Witchel, Emmett (January 2020, 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20))
null (Ed.)
The adoption of low latency persistent memory modules (PMMs) upends the long-established model of remote storage for distributed file systems. Instead, by colocating computation with PMM storage, we can provide applications with much higher IO performance, sub-second application failover, and strong consistency. To demonstrate this, we built the Assise distributed file system, based on a persistent, replicated coherence protocol that manages client-local PMM as a linearizable and crash-recoverable cache between applications and slower (and possibly remote) storage. Assise maximizes locality for all file IO by carrying out IO on process-local, socket-local, and client-local PMM whenever possible. Assise minimizes coherence overhead by maintaining consistency at IO operation granularity, rather than at fixed block sizes. We compare Assise to Ceph/BlueStore, NFS, and Octopus on a cluster with Intel Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as LevelDB, Postfix, and FileBench. We find that Assise improves write latency up to 22x, throughput up to 56x, fail-over time up to 103x, and scales up to 6x better than its counterparts, while providing stronger consistency semantics.
more » « less
Full Text Available
Sonata: query-driven streaming network telemetry

https://doi.org/10.1145/3230543.3230555

Gupta, Arpit; Harrison, Rob; Canini, Marco; Feamster, Nick; Rexford, Jennifer; Willinger, Walter (August 2018, ACM SIGCOMM)

Full Text Available

Search for: All records