Cloud deployments now increasingly provision FPGA accelerators as part of virtual instances. While FPGAs are still essentially single-tenant, the growing demand for hardware acceleration will inevitably lead to the need for methods and architectures supporting FPGA multi-tenancy. In this paper, we propose an architecture supporting space-sharing of FPGA devices among multiple tenants in the cloud. The proposed architecture implements a network-on-chip (NoC) designed for fast data movement and low hardware footprint. Prototyping the proposed architecture on a Xilinx Virtex Ultrascale + demonstrated near specification maximum frequency for on-chip data movement and high throughput in virtual instance access to hardware accelerators. We demonstrate similar performance compared to single-tenant deployment while increasing FPGA utilization (we achieved 6× higher FPGA utilization with our case study), which is one of the major goals of virtualization. Overall, our NoC interconnect achieved about 2× higher maximum frequency than the state-of-the-art and a bandwidth of 25.6 Gbps
This content will become publicly available on June 30, 2023
Deploying Multi-tenant FPGAs within Linux-based Cloud Infrastructure
Cloud deployments now increasingly exploit Field-Programmable Gate Array (FPGA) accelerators as part of virtual instances. While cloud FPGAs are still essentially single-tenant, the growing demand for efficient hardware acceleration paves the way to FPGA multi-tenancy. It then becomes necessary to explore architectures, design flows, and resource management features that aim at exposing multi-tenant FPGAs to the cloud users. In this article, we discuss a hardware/software architecture that supports provisioning space-shared FPGAs in Kernel-based Virtual Machine (KVM) clouds. The proposed hardware/software architecture introduces an FPGA organization that improves hardware consolidation and support hardware elasticity with minimal data movement overhead. It also relies on VirtIO to decrease communication latency between hardware and software domains. Prototyping the proposed architecture with a Virtex UltraScale+ FPGA demonstrated near specification maximum frequency for on-chip data movement and high throughput in virtual instance access to hardware accelerators. We demonstrate similar performance compared to single-tenant deployment while increasing FPGA utilization, which is one of the goals of virtualization. Overall, our FPGA design achieved about 2× higher maximum frequency than the state of the art and a bandwidth reaching up to 28 Gbps on 32-bit data width.
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- ACM Transactions on Reconfigurable Technology and Systems
- Page Range or eLocation-ID:
- 1 to 31
- Sponsoring Org:
- National Science Foundation
More Like this
FPGAs are getting an increasing interest from public clouds and cloud research projects. They are particularly attractive because of their ability to serve as energy efficient and customizable hardware accelerators. Commercial clouds have however highlighted the lack of multi-tenancy support, which does not permit hardware consolidation as it is not possible to space-share FPGA resources between multiple tenants. In this paper, we propose an architecture that divides the FPGA into logically isolated regions that we call ” virtual regions ” (VR). The VRs are immersed in a NoC interconnect allowing flexible communication, fast data movement, and low hardware footprint. The proposed architecture enables multitenancy as VRs can be allocated to different tenants at runtime.
Cloud and data center applications increasingly leverage FPGAs because of their performance/watt benefits and flexibility advantages over traditional processing cores such as CPUs and GPUs. As the rising demand for hardware acceleration gradually leads to FPGA multi-tenancy in the cloud, there are rising concerns about the security challenges posed by FPGA virtualization. Exposing space-shared FPGAs to multiple cloud tenants may compromise the confidentiality, integrity, and availability of FPGA-accelerated applications. In this work, we present a hardware/software architecture for domain isolation in FPGA-accelerated clouds and data centers with a focus on software-based attacks aiming at unauthorized access and information leakage. Our proposed architecture implements Mandatory Access Control security policies from software down to the hardware accelerators on FPGA. Our experiments demonstrate that the proposed architecture protects against such attacks with minimal area and communication overhead.
With the deployment of artificial intelligent (AI) algorithms in a large variety of applications, there creates an increasing need for high-performance computing capabilities. As a result, different hardware platforms have been utilized for acceleration purposes. Among these hardware-based accelerators, the field-programmable gate arrays (FPGAs) have gained a lot of attention due to their re-programmable characteristics, which provide customized control logic and computing operators. For example, FPGAs have recently been adopted for on-demand cloud services by the leading cloud providers like Amazon and Microsoft, providing acceleration for various compute-intensive tasks. While the co-residency of multiple tenants on a cloud FPGA chip increases the efficiency of resource utilization, it also creates unique attack surfaces that are under-explored. In this paper, we exploit the vulnerability associated with the shared power distribution network on cloud FPGAs. We present a stealthy power attack that can be remotely launched by a malicious tenant, shutting down the entire chip and resulting in denial-of-service for other co-located benign tenants. Specifically, we propose stealthy-shutdown: a well-timed power attack that can be implemented in two steps: (1) an attacker monitors the realtime FPGA power-consumption detected by ring-oscillator-based voltage sensors, and (2) when capturing high power-consuming moments, i.e., the power consumptionmore »
Deep-Dup: An Adversarial Weight Duplication Attack Framework to Crush Deep Neural Network in Multi-Tenant FPGAThe wide deployment of Deep Neural Networks (DNN) in high-performance cloud computing platforms brought to light multi-tenant cloud field-programmable gate arrays (FPGA) as a popular choice of accelerator to boost performance due to its hardware reprogramming flexibility. Such a multi-tenant FPGA setup for DNN acceleration potentially exposes DNN interference tasks under severe threat from malicious users. This work, to the best of our knowledge, is the first to explore DNN model vulnerabilities in multi-tenant FPGAs. We propose a novel adversarial attack framework: Deep-Dup, in which the adversarial tenant can inject adversarial faults to the DNN model in the victim tenant of FPGA. Specifically, she can aggressively overload the shared power distribution system of FPGA with malicious power-plundering circuits, achieving adversarial weight duplication (AWD) hardware attack that duplicates certain DNN weight packages during data transmission between off-chip memory and on-chip buffer, to hijack the DNN function of the victim tenant. Further, to identify the most vulnerable DNN weight packages for a given malicious objective, we propose a generic vulnerable weight package searching algorithm, called Progressive Differential Evolution Search (P-DES), which is, for the first time, adaptive to both deep learning white-box and black-box attack models. The proposed Deep-Dup is experimentally validatedmore »