Title: TCP ≈ RDMA: CPU-efficient Remote Storage Access with i10
This paper presents the design, implementation, and evaluation of i10, a new remote storage stack implemented entirely in the kernel. i10 runs on commodity hardware, allows unmodified applications to operate directly on the kernel’s TCP/IP network stack, and yet saturates a 100Gbps link for remote accesses with CPU utilization similar to that of state-of-the-art user-space and RDMA-based solutions.
Award ID(s):
1704742
PAR ID:
10189405
Author(s) / Creator(s):
Date Published:
Journal Name:
USENIX Symposium on Networked Systems Design and Implementation
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Quantum computing has become widely available to researchers via cloud-hosted devices with different technologies using a multitude of software development frameworks. The vertical stack behind such solutions typically features quantum language abstraction and high-level translation frameworks that tend to be open source, down to pulse-level programming. However, the lower-level mapping to the control electronics, such as controls for laser and microwave pulse generators, remains closed source for contemporary commercial cloud-hosted quantum devices. One exception is the ARTIQ (Advanced Real-Time Infrastructure for Quantum physics) open-source library for trapped-ion control electronics. This stack has been complemented by the Duke ARTIQ Extensions (DAX) to provide modularity and better abstraction. It, however, remains disconnected from the wealth of features provided by popular quantum computing languages. This paper contributes QisDAX, a bridge between Qiskit and DAX that fills this gap. QisDAX provides interfaces for Python programs written using IBM's Qiskit and transpiles them to the DAX abstraction. This allows users to generically interface to the ARTIQ control systems accessing trapped-ion quantum devices. Consequently, the algorithms expressed in Qiskit become available to an open-source quantum software stack. This provides the first open-source, end-to-end, full-stack pipeline for remote submission of quantum programs for trapped-ion quantum systems in a non-commercial setting. 
  2. Internet-of-Things devices such as autonomous vehicular sensors, medical devices, and industrial cyber-physical systems commonly rely on small, resource-constrained microcontrollers (MCUs). MCU software is typically written in C and is prone to memory safety vulnerabilities that are exploitable by remote attackers to launch code reuse attacks and code/control data leakage attacks. We present Randezvous, a highly performant diversification-based mitigation to such attacks and their brute force variants on ARM MCUs. Atop code/data layout randomization and an efficient execute-only code approach, Randezvous creates decoy pointers to camouflage control data in memory; code pointers in the stack are then protected by a diversified shadow stack, local-to-global variable promotion, and return address nullification. Moreover, Randezvous adds a novel delayed reboot mechanism to slow down persistent attacks and mitigates control data spraying attacks via global guards. We demonstrate Randezvous’s security by statistically modeling leakage-equipped brute force attacks under Randezvous, crafting a proof-of-concept exploit that shows Randezvous’s efficacy, and studying a real-world CVE. Our evaluation of Randezvous shows low overhead on three benchmark suites and two applications. 
  3. Distributed applications enhance their execution by using remote resources. However, distributed execution incurs communication, synchronization, fault-handling, and security overheads. If these overheads are not offset by an even larger execution enhancement, distribution becomes counterproductive. For maximum benefit, the distribution’s granularity cannot be too fine or too coarse; it must be just right. In this paper, we present a novel approach to re-architecting distributed applications whose distribution granularity has turned out to be ill-conceived. To adjust the distribution of such applications, our approach automatically reshapes their remote invocations to reduce aggregate latency and resource consumption. To that end, our approach insources a remote functionality for local execution, splits it into separate functions to profile their performance, and determines the optimal redistribution based on a cost function. Redistribution strategies combine separate functions into single remotely invocable units. To automate all the required program transformations, our approach introduces a series of domain-specific automatic refactorings. We have concretely realized our approach as an analysis and automatic program transformation infrastructure for the important domain of full-stack JavaScript applications, and evaluated its value, utility, and performance on a series of real-world cross-platform mobile apps. Our evaluation results indicate that our approach can become a useful tool for software developers charged with the challenges of re-architecting distributed applications.
  4. Fluid-preserved reptile and amphibian specimens are challenging to photograph with traditional methods due to their complex three-dimensional forms and reflective surfaces when removed from solution. An effective approach to counteract these issues involves combining focus stack photography with the use of a photo immersion tank. Imaging specimens beneath a layer of preservative fluid eliminates glare and risk of specimen desiccation, while focus stacking produces sharp detail through merging multiple photographs taken at successive focal steps to create a composite image with an extended depth of field. This paper describes the wet imaging components and focus stack photography workflow developed while conducting a large-scale digitization project for targeted reptile and amphibian specimens housed in the University of Colorado Museum of Natural History Herpetology Collection. This methodology can be implemented in other collections settings and adapted for use with fluid-preserved specimen types across the Tree of Life to generate high-quality, taxonomically informative images for use in documenting biodiversity, remote examination of fine traits, inclusion in publications, and educational applications. 
  5. Achieving low remote memory access latency remains the primary challenge in realizing memory disaggregation over Ethernet within datacenters. We present EDM, which attempts to overcome this challenge using two key ideas. First, while existing network protocols for remote memory access over Ethernet, such as TCP/IP and RDMA, are implemented on top of the Ethernet MAC layer, EDM takes a radical approach by implementing the entire network protocol stack for remote memory access within the Physical layer (PHY) of the Ethernet. This overcomes fundamental latency and bandwidth overheads imposed by the MAC layer, especially for small memory messages. Second, EDM implements a centralized, fast, in-network scheduler for memory traffic within the PHY of the Ethernet switch. Inspired by the classic Parallel Iterative Matching (PIM) algorithm, the scheduler dynamically reserves bandwidth between compute and memory nodes by creating virtual circuits in the PHY, thus eliminating queuing delay and layer 2 packet processing delay at the switch for memory traffic, while maintaining high bandwidth utilization. Our FPGA testbed demonstrates that EDM's network fabric incurs a latency of only ~300 ns for remote memory access in an unloaded network, which is an order of magnitude lower than state-of-the-art Ethernet-based solutions such as RoCEv2 and comparable to emerging PCIe-based solutions such as CXL. Larger-scale network simulations indicate that even at high network loads, EDM's average latency remains within 1.3x its unloaded latency.