Title: Power Side-Channel Attacks on BNN Accelerators in Remote FPGAs
To lower cost and increase the utilization of Cloud Field-Programmable Gate Arrays (FPGAs), researchers have recently been exploring multi-tenant FPGAs, where multiple independent users simultaneously share the same remote FPGA. Despite its benefits, multi-tenancy opens up the possibility of malicious users co-locating on the same FPGA as a victim and extracting sensitive information. The issue becomes especially serious when the victim is running a machine learning algorithm that processes sensitive or private data. To demonstrate the dangers, this paper presents a remote, power-based side-channel attack on a deep neural network accelerator running in a variety of Xilinx FPGAs and on Cloud FPGAs using Amazon Web Services (AWS) F1 instances. In particular, this work shows how to remotely obtain voltage estimates as a deep neural network inference circuit executes, and how that information can be used to recover the inputs to the network. The attack is demonstrated with a binarized convolutional neural network used to recognize handwritten digits from the MNIST database. Using precise time-to-digital converters (TDCs) for remote voltage estimation, the MNIST inputs can be recovered with a maximum normalized cross-correlation of 79% between the input image and the recovered image on local FPGA boards, and 72% on AWS F1 instances. The attack requires neither physical access nor modifications to the FPGA hardware.
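The normalized cross-correlation metric used above to score how closely a recovered image matches the original input can be sketched as follows; this is a minimal, zero-mean illustration of the metric, and the array and function names are illustrative rather than taken from the paper's artifact.

```python
import numpy as np

def normalized_cross_correlation(original, recovered):
    """Zero-mean normalized cross-correlation between two same-size images."""
    a = original.astype(float) - original.mean()
    b = recovered.astype(float) - recovered.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

# Example: a perfect recovery scores 1.0; an inverted image scores -1.0.
rng = np.random.default_rng(0)
img = rng.random((28, 28))            # stand-in for a 28x28 MNIST input
print(normalized_cross_correlation(img, img))      # close to 1.0
print(normalized_cross_correlation(img, 1 - img))  # close to -1.0
```

A score of 79% thus means the recovered image is strongly, though not perfectly, aligned with the original pixel pattern.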
Moini, Shayan; Tian, Shanquan; Szefer, Jakub; Holcomb, Daniel; Tessier, Russell (Design, Automation and Test in Europe Conference (DATE))
Multi-tenant FPGAs have recently been proposed, where multiple independent users simultaneously share a remote FPGA. Despite its benefits for cost and utilization, multi-tenancy opens up the possibility of malicious users extracting sensitive information from co-located victim users. To demonstrate the dangers, this paper presents a remote, power-based side-channel attack on a binarized neural network (BNN) accelerator. This work shows how to remotely obtain voltage estimates as the BNN circuit executes, and how the information can be used to recover the inputs to the BNN. The attack is demonstrated with a BNN used to recognize handwriting images from the MNIST dataset. With the use of precise time-to-digital converters (TDCs) for remote voltage estimation, the MNIST inputs can be successfully recovered with a maximum normalized cross-correlation of 75% between the input image and the recovered image.
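The TDC sensing idea underlying this attack can be modeled in a few lines: a launched clock edge propagates through a chain of delay elements during one clock period, and the number of stages it reaches is latched each cycle; lower core voltage slows the elements, so fewer stages are reached. The sketch below is purely conceptual, and the stage counts and calibration slope are invented for illustration.

```python
# Conceptual model of a TDC-based voltage sensor (all numbers hypothetical).
NOMINAL_STAGES = 32        # stages reached at nominal voltage (assumed)
STAGES_PER_MV = 0.4        # calibration slope in stages per mV (assumed)

def tdc_to_voltage_drop_mv(stage_count):
    """Convert a latched TDC stage count to an estimated supply droop in mV."""
    return (NOMINAL_STAGES - stage_count) / STAGES_PER_MV

# A made-up trace of stage counts sampled while the victim BNN runs:
trace = [32, 31, 27, 26, 31, 32]
droops = [tdc_to_voltage_drop_mv(c) for c in trace]
print(droops)  # droop grows while the accelerator switches heavily
```

The recovered droop trace is the raw material from which input-dependent activity is inferred.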
Tian, Shanquan; Moini, Shayan; Holcomb, Daniel; Tessier, Russell; Szefer, Jakub (2023 Design, Automation & Test in Europe Conference & Exhibition (DATE))
The security and performance of FPGA-based accelerators play vital roles in today's cloud services. In addition to supporting convenient access to high-end FPGAs, cloud vendors and third-party developers now provide numerous FPGA accelerators for machine learning models. However, the security of accelerators developed for state-of-the-art Cloud FPGA environments has not been fully explored, since most remote accelerator attacks have been prototyped on local FPGA boards in lab settings rather than in Cloud FPGA environments. To address this research gap, this work analyzes three existing machine learning accelerators developed in Xilinx Vitis to assess the potential threats of power attacks on accelerators in Amazon Web Services (AWS) F1 Cloud FPGA platforms in a multi-tenant setting. The experiments show that malicious co-tenants in a multi-tenant environment can instantiate voltage-sensing circuits as register-transfer level (RTL) kernels within the Vitis design environment to spy on co-tenant modules. A methodology for launching a practical remote power attack on Cloud FPGAs is also presented, which uses an enhanced time-to-digital converter (TDC) based voltage sensor and an auto-trigger mechanism. The TDC is used to capture power signatures, which are then used to identify power consumption spikes and observe activity patterns involving the FPGA shell, DRAM on the FPGA board, or a co-tenant victim's accelerators. Voltage change patterns related to shell use and accelerators are then used to create an auto-triggered attack that can automatically detect when to capture voltage traces without the need for a hard-wired synchronization signal between victim and attacker. To address the novel threats presented in this work, the paper also discusses defenses that could be leveraged to secure multi-tenant Cloud FPGAs from power-based attacks.
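The auto-trigger idea described above can be approximated by a simple threshold rule: start capturing once the sensor reading deviates far enough from its quiet baseline. The sketch below is a toy model; the readings, threshold, and window length are invented, not taken from the paper.

```python
def auto_trigger(samples, baseline, threshold, capture_len):
    """Return the first window of `capture_len` samples once the reading
    drops below baseline - threshold (a hypothetical droop trigger)."""
    for i, s in enumerate(samples):
        if s < baseline - threshold:
            return samples[i:i + capture_len]
    return None  # no victim activity detected in this trace

sensor = [32, 32, 31, 32, 25, 24, 26, 30, 32, 32]  # made-up TDC readings
print(auto_trigger(sensor, baseline=32, threshold=4, capture_len=4))
# → [25, 24, 26, 30]
```

This removes the need for a hard-wired synchronization signal: the attacker simply waits for the victim's power signature to cross the trigger condition.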
Tian, Shanquan; Szefer, Jakub (International Symposium on Field-Programmable Gate Arrays (FPGA))
With increasing interest in Cloud FPGAs, such as Amazon's EC2 F1 instances or Microsoft's Azure with Catapult servers, FPGAs in cloud computing infrastructures can become targets for information leakage via covert channel communication. Cloud FPGAs leverage temporal sharing of FPGA resources between users. This paper shows that heat generated by one user can be observed by another user who later uses the same FPGA. The covert data transfer can be achieved through simple on-off keying (OOK), and the use of multiple FPGA boards in parallel significantly improves data throughput. The new temporal thermal covert channel is demonstrated on Microsoft's Catapult servers with FPGAs running remotely in the Texas Advanced Computing Center (TACC). A number of defenses against the new temporal thermal covert channel are presented at the end of the paper.
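On-off keying in this setting is simply one bit per time interval: the sender either heats the die (bit 1) or stays idle (bit 0), and the later tenant thresholds its temperature proxy. The decoder can be sketched as below; the readings and threshold are invented for illustration (a real receiver would typically use ring-oscillator frequencies, which fall as the die heats, and invert accordingly).

```python
def ook_decode(readings, threshold):
    """Decode one bit per interval: above threshold => sender heated the FPGA."""
    return [1 if r > threshold else 0 for r in readings]

# Hypothetical per-interval temperature proxies observed by the receiver:
readings = [51.0, 54.2, 54.0, 50.8, 54.1, 50.9]
print(ook_decode(readings, threshold=52.5))  # → [0, 1, 1, 0, 1, 0]
```

Because heat dissipates slowly, each bit interval must be long, which is why running several FPGA boards in parallel is what makes the throughput practical.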
Giechaskiel, Ilias; Tian, Shanquan; Szefer, Jakub (ACM Transactions on Reconfigurable Technology and Systems)
The availability of FPGAs in cloud data centers offers rapid, on-demand access to reconfigurable hardware compute resources that users can adapt to their own needs. However, the low-level access to the FPGA hardware and associated resources such as the PCIe bus, SSD drives, or DRAM modules also opens up threats of malicious attackers uploading designs that are able to infer information about other users or about the cloud infrastructure itself. In particular, this work presents a new, fast PCIe-contention-based channel that is able to transmit data between FPGA-accelerated virtual machines by modulating the PCIe bus usage. The channel works across different operating systems and achieves bandwidths reaching 20 kbps with 99% accuracy. This is the first cross-FPGA covert channel demonstrated on commercial clouds, and its bandwidth is over 2000× larger than that of prior voltage- or temperature-based cross-board attacks. This paper further demonstrates that the PCIe receivers can not only receive covert transmissions but also perform fine-grained monitoring of the PCIe bus, including detecting when co-located VMs are initialized, even prior to their associated FPGAs being used. Moreover, the proposed mechanism can be used to infer the activities of other users, or even slow down the programming of the co-located FPGAs as well as other data transfers between the host and the FPGA. Beyond leaking information across different virtual machines, the ability to monitor the PCIe bandwidth over hours or days can be used to estimate data center utilization and map the behavior of other users. The paper also introduces further novel threats in FPGA-accelerated instances, including contention due to network traffic, contention due to shared NVMe SSDs, as well as thermal monitoring to identify FPGA co-location using the DRAM modules attached to the FPGA boards.
This is the first work to demonstrate that it is possible to break the separation of privilege in FPGA-accelerated cloud environments, and highlights that defenses for public clouds using FPGAs need to consider PCIe, SSD, and DRAM resources as part of the attack surface that should be protected.
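A contention channel of this kind can also be decoded by thresholding: when the co-located sender saturates the shared PCIe bus, the receiver's own transfers take measurably longer. The toy decoder below illustrates the principle only; the latency values and the 1.5× threshold are invented, not measurements from the paper.

```python
def decode_pcie_channel(latencies_us, quiet_us):
    """Classify each interval: elevated transfer latency suggests the
    co-located sender was saturating the shared PCIe bus (bit 1)."""
    return [1 if t > 1.5 * quiet_us else 0 for t in latencies_us]

# Hypothetical per-interval DMA round-trip latencies in microseconds:
latencies = [10.2, 25.7, 24.9, 10.4, 26.1]
print(decode_pcie_channel(latencies, quiet_us=10.0))  # → [0, 1, 1, 0, 1]
```

Unlike the thermal channel, contention appears and vanishes almost instantly, which is what allows bit intervals short enough for kilobit-per-second rates.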
Sharma, Aakash; Bhasi, Vivek; Singh, Sonali; Jain, Rishabh; Raj, Jashwant; Mitra, Subrata; Kandemir, Mahmut Taylan; Kesidis, George; Das, Chita (Proceedings of the International Conference on Distributed Computing Systems)
Deep neural networks (DNNs) are increasingly popular owing to their ability to solve complex problems such as image recognition, autonomous driving, and natural language processing. Their growing complexity, coupled with the use of larger volumes of training data (to achieve acceptable accuracy), has warranted the use of GPUs and other accelerators. Such accelerators are typically expensive, with users having to pay a high upfront cost to acquire them. For infrequent use, users can instead leverage the public cloud to mitigate the high acquisition cost. However, with the wide diversity of hardware instances (particularly GPU instances) available in the public cloud, it becomes challenging for a user to make an appropriate choice from a cost/performance standpoint.

In this work, we address this problem by (i) introducing Stash, a comprehensive distributed deep learning (DDL) profiler that determines the various execution stalls that DDL suffers from, and (ii) using Stash to extensively characterize various public cloud GPU instances by running popular DNN models on them. Specifically, Stash estimates two types of communication stalls, namely interconnect and network stalls, that play a dominant role in DDL execution time. Stash is implemented on top of prior work, DS-analyzer, which computes only the CPU and disk stalls. Using our detailed stall characterization, we list the advantages and shortcomings of public cloud GPU instances to help users make informed decisions. Our characterization results indicate that the more expensive GPU instances may not be the most performant for all DNN models, and that AWS can sometimes sub-optimally allocate hardware interconnect resources. Specifically, the intra-machine interconnect can introduce communication overheads of up to 90% of DNN training time, and network-connected instances can suffer from up to 5× slowdown compared to training on a single instance. Furthermore, (iii) we model the impact of DNN macroscopic features, such as the number of layers and the number of gradients, on communication stalls, and finally, (iv) we briefly discuss a cost comparison with existing work.
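The kind of stall accounting such a profiler produces can be illustrated with a toy calculation; the function and all numbers below are invented in the spirit of the description above and are not Stash's actual API or results.

```python
def stall_breakdown(total_s, interconnect_s, network_s, cpu_s, disk_s):
    """Fraction of training time attributable to each stall type
    (hypothetical accounting; Stash's real interface may differ)."""
    stalls = {"interconnect": interconnect_s, "network": network_s,
              "cpu": cpu_s, "disk": disk_s}
    return {kind: seconds / total_s for kind, seconds in stalls.items()}

# Made-up numbers for one epoch on a multi-GPU cloud instance:
breakdown = stall_breakdown(total_s=100.0, interconnect_s=38.0,
                            network_s=22.0, cpu_s=6.0, disk_s=4.0)
print(breakdown)  # interconnect and network stalls dominate here
```

Comparing such breakdowns across instance types is what lets a user see when a pricier GPU instance is bottlenecked on communication rather than compute.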
Moini, Shayan; Tian, Shanquan; Holcomb, Daniel; Szefer, Jakub; Tessier, Russell. "Power Side-Channel Attacks on BNN Accelerators in Remote FPGAs." IEEE Journal on Emerging and Selected Topics in Circuits and Systems. https://par.nsf.gov/biblio/10225319.