skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Low-Latency Preprocessing Architecture for Residue Number System via Flexible Barrett Reduction for Homomorphic Encryption
Data privacy has become a significant concern due to the rapid development of cloud services, Internet of Things, edge devices, and other applications. Homomorphic encryption (HE) addresses the issue by enabling computations to be performed without the decryption of the encrypted message. However, the bottleneck of designing homomorphic encryption hardware is the complexity of computation. To tackle the long integer arithmetic, the residue number system based on the Chinese remainder theorem is used. In this paper, we propose a novel modular reduction architecture that computes the mapping of residual polynomials in parallel with high speed and low latency. We implement our proposed design in the Xilinx Ultrascale+ FPGA board (VCU118). When the input sizes are 360-bit (1440-bit), the frequency is 180MHz (168MHz) with 4 pipelining stages. Also, the area delay product (ADP) of DSP blocks of our design is reduced by 23 and 31 percent, respectively, for 360 and 1440 bits, compared to prior work.  more » « less
Award ID(s):
2243053
PAR ID:
10481206
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE Transactions on Circuits and Systems II: Express Briefs
Volume:
71
Issue:
5
ISSN:
1549-7747
Page Range / eLocation ID:
2784-2788
Subject(s) / Keyword(s):
Homomorphic encryption , residue number system , Chinese remainder theorem , modular reduction , Barrett reduction , hardware accelerator
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents \textit{OFHE}, an electro-optical accelerator designed to process Discretized TFHE (DTFHE) operations, which encrypt multi-bit messages and support homomorphic multiplications, lookup table operations and full-domain functional bootstrappings. While DTFHE is more efficient and versatile than other fully homomorphic encryption schemes, it requires 32-, 64-, and 128-bit polynomial multiplications, which can be time-consuming. Existing TFHE accelerators are not easily upgradable to support DTFHE operations due to limited datapaths, a lack of datapath bit-width reconfigurability, and power inefficiencies when processing FFT and inverse FFT (IFFT) kernels. Compared to prior TFHE accelerators, OFHE addresses these challenges by improving the DTFHE operation latency by 8.7\%, the DTFHE operation throughput by $$57\%$$, and the DTFHE operation throughput per Watt by $$94\%$$. 
    more » « less
  2. We consider an architecture of confidential cloud-based control synthesis based on Homomorphic Encryption (HE). Our study is motivated by the recent surge of data-driven control such as deep reinforcement learning, whose heavy computational requirements often necessitate an outsourcing to the third party server. To achieve more flexibility than Partially Homomorphic Encryption (PHE) and less computational overhead than Fully Homomorphic Encryption (FHE), we consider a Reinforcement Learning (RL) architecture over Leveled Homomorphic Encryption (LHE). We first show that the impact of the encryption noise under the Cheon-Kim-Kim-Song (CKKS) encryption scheme on the convergence of the model-based tabular Value Iteration (VI) can be analytically bounded. We also consider secure implementations of TD(0), SARSA(0) and Z-learning algorithms over the CKKS scheme, where we numerically demonstrate that the effects of the encryption noise on these algorithms are also minimal. 
    more » « less
  3. The widespread use of machine learning is changing our daily lives. Unfortunately, clients are often concerned about the privacy of their data when using machine learning-based applications. To address these concerns, the development of privacy-preserving machine learning (PPML) is essential. One promising approach is the use of fully homomorphic encryption (FHE) based PPML, which enables services to be performed on encrypted data without decryption. Although the speed of computationally expensive FHE operations can be significantly boosted by prior ASIC-based FHE accelerators, the performance of key-switching, the dominate primitive in various FHE operations, is seriously limited by their small bit-width datapaths and frequent matrix transpositions. In this paper, we present an electro-optical (EO) PPML accelerator, PriML, to accelerate FHE operations. Its 512-bit datapath supporting 510-bit residues greatly reduces the key-switching cost. We also create an in-scratchpad-memory transpose unit to fast transpose matrices. Compared to prior PPML accelerators, on average, PriML reduces the latency of various machine learning applications by > 94.4% and the energy consumption by > 95%. 
    more » « less
  4. As cloud computing continues to gain widespread adoption, safeguarding the confidentiality of data entrusted to third-party cloud service providers becomes a critical concern. While traditional encryption methods offer protection for data at rest and in transit, they fall short when it comes to where it matters the most, i.e., during data processing. To address this limitation, we present HELM, a framework for privacy-preserving data processing using homomorphic encryption. HELM automatically transforms arbitrary programs expressed in a Hardware Description Language (HDL), such as Verilog, into equivalent homomorphic circuits, which can then be efficiently evaluated using encrypted inputs. HELM features three modes of encrypted evaluation: a) a gate mode that consists of Boolean gates, b) a small-precision lookup table mode which significantly reduces the size of the circuit by combining multiple gates into lookup tables, and c) a high-precision lookup table mode tuned for multi-bit arithmetic evaluations. Finally, HELM introduces a scheduler that leverages the parallelism inherent in arithmetic and Boolean circuits to efficiently evaluate encrypted programs. We evaluate HELM with the ISCAS'85 and ISCAS'89 benchmark suites, as well as real-world applications such as image filtering and neural network inference. In our experimental results, we report that HELM can outperform prior works by up to 65x. 
    more » « less
  5. We say a tuple of NP statements satisfies a monotone policy if , where if and only if is in the NP language. A monotone-policy batch argument (monotone-policy BARG) for NP is a natural extension of regular batch arguments (BARGs) that allows a prover to prove that satisfy a monotone policy P with a proof of size , where is the size of the Boolean circuit computing the NP relation . Previously, Brakerski, Brodsky, Kalai, Lombardi, and Paneth (CRYPTO 2023) and Nassar, Waters, and Wu (TCC 2024) showed how to construct monotone-policy BARGs from (somewhere-extractable) BARGs for NP together with a leveled homomorphic encryption scheme (Brakerski et al.) or an additively homomorphic encryption scheme over a sufficiently-large group (Nassar et al.). In this work, we improve upon both works by showing that BARGs together with additively homomorphic encryption over any group suffices (e.g., over). For instance, we can instantiate the additively homomorphic encryption with the classic Goldwasser-Micali encryption scheme based on the quadratic residuosity (QR) assumption. Then, by appealing to existing compilers, we also obtain a monotone-policy aggregate signature scheme from any BARG and the QR assumption. 
    more » « less