Multiple vendors have recently released SmartNICs that provide both special-purpose accelerators and programmable processing cores that allow increasingly sophisticated packet processing tasks to be offloaded from general-purpose CPUs. Indeed, leading data-center operators have designed and deployed SmartNICs at scale to support both network virtualization and application-specific tasks. Unfortunately, cloud providers have not yet opened up the full power of these devices to tenants, as current runtimes do not provide adequate isolation between individual applications running on the SmartNICs themselves. We introduce FairNIC, a system to provide performance isolation between tenants utilizing the full capabilities of a commodity SoC SmartNIC. We implement FairNIC on Cavium LiquidIO 2360s and show that we are able to isolate not only typical packet processing, but also prevent MIPS-core cache pollution and fairly share access to fixed-function hardware accelerators. We use FairNIC to implement NIC-accelerated OVS and key/value store applications and show that they both can cohabitate on a single NIC using the same port, where the performance of each is unimpacted by other tenants. We argue that our results demonstrate the feasibility of sharing SmartNICs among virtual tenants, and motivate the development of appropriate security isolation mechanisms.
Isolation Mechanisms for High-Speed Packet-Processing Pipelines
Data-plane programmability is now mainstream. As we find more use cases, deployments need to be able to run multiple packet-processing modules in a single device. These are likely to be developed by independent teams, either within the same organization or from multiple organizations. Therefore, we need isolation mechanisms to ensure that modules on the same device do not interfere with each other. This paper presents Menshen, an extension of the Reconfigurable Match Tables (RMT) pipeline that enforces isolation between different packet-processing modules. Menshen comprises a set of lightweight hardware primitives and an extension to the open-source P4-16 reference compiler that act in conjunction to meet this goal. We have prototyped Menshen on two FPGA platforms (NetFPGA and Corundum). We show that our design provides isolation and allows new modules to be loaded without impacting the ones already running. Finally, we demonstrate the feasibility of implementing Menshen on ASICs by using the FreePDK45nm technology library and the Synopsys DC synthesis software, showing that our design meets timing at a 1 GHz clock frequency and needs approximately 6% additional chip area. We have open-sourced the code for Menshen’s hardware and software at https://isolation.quest/.
- Award ID(s): 2028832
- PAR ID: 10396612
- Date Published:
- Journal Name: NSDI 2022
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- As Machine Learning (ML) applications become pervasive and computer architects further integrate hardware support, the need to rapidly explore trade-offs between algorithms and hardware becomes pressing. While prior work on hardware accelerators has led to tremendous performance and energy improvements, it can be difficult to generalize these approaches without resorting to special-purpose tools or even languages. Through object-oriented design principles, we describe a general and reusable approach for generating parameterized neural network hardware. Specifically, we describe our experiences with high-level hardware design objects for building neural network hardware based on the open-source Python HDL, PyRTL. By thinking at a higher level of abstraction than simple “hardware modules,” we open the door to a process by which hardware can be developed with software engineering principles. This creates new opportunities for a tight feedback loop between machine learning algorithm innovation and hardware design reality. Future work on hardware development for ML applications can benefit from our analysis of the costs and benefits of abstraction.
- Packet scheduling determines the ordering of packets in a queuing data structure with respect to some ranking function that is mandated by a scheduling policy. It is the core component in many recent innovations to optimize network performance and utilization. Our focus in this paper is on the design and deployment of packet scheduling in software. Software schedulers have several advantages over hardware, including a shorter development cycle and flexibility in functionality and deployment location. We substantially improve current software packet scheduling performance, while maintaining flexibility, by exploiting underlying features of packet ranking; namely, packet ranks are integers and, at any point in time, fall within a limited range of values. We introduce Eiffel, a novel programmable packet scheduling system. At the core of Eiffel is an integer priority queue based on the Find First Set (FFS) instruction and designed to support a wide range of policies and ranking functions efficiently. As an even more efficient alternative, we also propose a new approximate priority queue that can outperform FFS-based queues for some scenarios. To support flexibility, Eiffel introduces novel programming abstractions to express scheduling policies that cannot be captured by current, state-of-the-art scheduler programming models. We evaluate Eiffel in a variety of settings and in both kernel and userspace deployments. We show that it outperforms state-of-the-art systems by 3-40x in terms of either number of cores utilized for network processing or number of flows given fixed processing capacity.
- The ever-increasing demand for high performance in the time-critical embedded domain has pushed the adoption of powerful yet unpredictable heterogeneous Systems-on-a-Chip. The shared memory subsystem, which is known to be a major source of unpredictability, has been extensively studied, and many mitigation techniques have been proposed. Among them, performance-counter-based regulation techniques have seen widespread adoption. However, the problem of combining performance-based regulation with time-domain isolation has not received enough attention. In this article, we discuss our current work-in-progress on SHCReg (Software Hardware Co-design Regulator). First, we assess the limitations and benefits of combined CPU and memory budgeting. Next, we outline a full-stack hardware/software co-design architecture that aims at improving the interplay between CPU and memory isolation for mixed-criticality tasks running on the same core.
- Objective. Neural activity represents a functional readout of neurons that is increasingly important to monitor in a wide range of experiments. Extracellular recordings have emerged as a powerful technique for measuring neural activity because these methods do not lead to the destruction or degradation of the cells being measured. Current approaches to electrophysiology have a low throughput of experiments due to manual supervision and expensive equipment. This bottleneck limits broader inferences that can be achieved with numerous long-term recorded samples. Approach. We developed Piphys, an inexpensive open-source neurophysiological recording platform that consists of both hardware and software. It is easily accessed and controlled via a standard web interface through Internet of Things (IoT) protocols. Main results. We used a Raspberry Pi as the primary processing device along with an Intan bioamplifier. We designed a hardware expansion circuit board and software to enable voltage sampling and user interaction. This standalone system was validated with primary human neurons, showing reliability in collecting neural activity in near real-time. Significance. The hardware modules and cloud software allow for remote control of neural recording experiments as well as horizontal scalability, enabling long-term observations of development, organization, and neural activity at scale.

