skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: HPMA-NTRU: High-Performance Polynomial Multiplication Accelerator for NTRU
Along the rapid development of large-scale quantum computers, post-quantum cryptography (PQC) has drawn significant attention from research community recently as it is proven that the existing public-key cryptosystems are vulnerable to the quantum attacks. Meanwhile, the recent trend in the PQC field has gradually switched to the hardware acceleration aspect. Following this trend, this work presents a novel implementation of a High-performance Polynomial Multiplication hardware Accelerator for NTRU (HPMA-NTRU) under different parameter settings, one of the lattice-based PQC algorithm that is currently under the consideration by the National Institute of Standards and Technology (NIST) PQC standardization process. In total, we have carried out three layers of efforts to obtain the proposed work. First of all, we have proposed a new schoolbook algorithm based strategy to derive the desired polynomial multiplication algorithm for NTRU. Then, we have mapped the algorithm to build a high-performance polynomial multiplication hardware accelerator and have extended this hardware accelerator to different parameter settings with proper adjustment. Finally, through a series of complexity analysis and implementation based comparison, we have shown that the proposed hardware accelerator obtains better area-time complexities than the state-of-the-art one. The outcome of this work is important and will impact the ongoing NIST PQC standardization process and can be deployed further to construct efficient NTRU cryptoprocessors.  more » « less
Award ID(s):
2020625
PAR ID:
10464946
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
2022 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)
Page Range / eLocation ID:
1 to 6
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The speed of NTRU-based Key Encapsulation Mechanisms (KEMs) in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate the potential for speeding up the implementations of four NTRU-based KEMs, using software/hardware codesign, when targeting Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). All investigated algorithms compete in Round 1 of the NIST PQC standardization process. They include: ntru-kem from the NTRUEncrypt submission, Streamlined NTRU Prime and NTRU LPRime KEMs of the NTRU Prime candidate, and NTRU- HRSS-KEM from the submission of the same name. The most time-consuming operation, polynomial multiplication, is implemented in the Programmable Logic (PL) of Zynq UltraScale+ (i.e., in hardware) using constant-time hardware architectures most appropriate for a given algorithm. The remaining operations are executed in the Processing System (PS) of Zynq, based on the ARM Cortex-A53 Application Processing Unit. The speed-ups of our software/hardware codesigns vs. purely software implementations, running on the same Zynq platform, are determined experimentally, and analyzed in the paper. Our experiments reveal substantial differences among the investigated candidates in terms of their potential to benefit from hardware accelerators, with the special focus on accelerators aimed at offloading to hardware only the most time-consuming operation of a given cryptosystems. The demonstrated speed-ups vs. functionally equivalent purely software implementations vary between 4.0 and 42.7 for encapsulation, and between 6.4 and 149.7 for decapsulation. 
    more » « less
  2. The rapid advancement in quantum technology has initiated a new round of exploration of efficient implementation of post-quantum cryptography (PQC) on hardware platforms. Key encapsulation mechanism (KEM) Saber, a module lattice-based PQC, is one of the four encryption scheme finalists in the third-round National Institute of Standards and Technology (NIST) standardization process. In this paper, we propose a novel Toeplitz Matrix-Vector Product (TMVP)-based design strategy to efficiently implement polynomial multiplication (essential arithmetic operation) for KEM Saber. The proposed work consists of three layers of interdependent efforts: (i) first of all, we have formulated the polynomial multiplication of KEM Saber into a desired mathematical form for further developing into the proposed TMVP-based algorithm for high-performance operation; (ii) then, we have followed the proposed TMVP-based algorithm to innovatively transfer the derived algorithm into a unified polynomial multiplication structure (fits all security ranks) with the help of a series of algorithm-to-architecture co-implementation/mapping techniques; (iii) finally, detailed implementation results and complexity analysis have confirmed the efficiency of the proposed TMVP design strategy. Specifically, the field-programmable gate array (FPGA) implementation results show that the proposed design has at least less 30.92% area-delay product (ADP) than the competing ones. 
    more » « less
  3. Post-quantum cryptography (PQC) has gained sub-stantial attention from various communities recently. Along with the ongoing National Institute of Standards and Technology (NIST) PQC standardization process that targets the general-purpose PQC algorithms, the research community is also looking for efficient lightweight PQC schemes. Among this direction of efforts, Ring-Binary-Learning-with-Errors (RBLWE)-based encryption scheme (RBLWE-ENC) is regarded as a promising lightweight PQC fitting Internet-of-Things (IoT) and edge computing applications. As hardware implementation for PQC algorithms has become one of the major advances in the field, in this paper, we follow this trend to present an efficient implementation of RBLWE-ENC lightweight accelerator on the field-programmable gate array (FPGA) platform. Overall, we have demonstrated three coherent interdependent stages of efforts: (i) we have presented detailed derivation processes to formulate the proposed algorithmic operation; (ii) we have then implemented the proposed algorithm into a desired hardware accelerator; and (iii) we provided thorough complexity analysis and comparison to showcase the superior performance of the proposed accelerator over the state-of-the-art designs, e.g., the proposed accelerator with v=8 has at least 66.67% less area-time complexities than the existing ones (Virtex-7 FPGA). We hope the outcome of this work can facilitate lightweight PQC development. 
    more » « less
  4. The rapid advancement in quantum technology has initiated a new round of post-quantum cryptography (PQC) related exploration. The key encapsulation mechanism (KEM) Saber is an important module lattice-based PQC, which has been selected as one of the PQC finalists in the ongoing National Institute of Standards and Technology (NIST) standardization process. On the other hand, however, efficient hardware implementation of KEM Saber has not been well covered in the literature. In this paper, therefore, we propose a novel cyclic-row oriented processing (CROP) strategy for efficient implementation of the key arithmetic operation of KEM Saber, i.e., the polynomial multiplication. The proposed work consists of three layers of interdependent efforts: (i) first of all, we have formulated the main operation of KEM Saber into desired mathematical forms to be further developed into CROP based algorithms, i.e., the basic version and the advanced higher-speed version; (ii) then, we have followed the proposed CROP strategy to innovatively transfer the derived two algorithms into desired polynomial multiplication structures with the help of a series of algorithm-architecture co-implementation techniques; (iii) finally, detailed complexity analysis and implementation results have shown that the proposed polynomial multiplication structures have better area-time complexities than the state-of-the-art solutions. Specifically, the field-programmable gate array (FPGA) implementation results show that the proposed design, e.g., the basic version has at least less 11.2% area-delay product (ADP) than the best competing one (Cyclone V device). The proposed high-performance polynomial multipliers offer not only efficient operation for output results delivery but also possess low-complexity feature brought by CROP strategy. The outcome of this work is expected to provide useful references for further development and standardization process of KEM Saber. 
    more » « less
  5. The recent research in post-quantum cryptography (PQC) field has gradually switched to efficient implementation of PQC algorithms on hardware platforms. As polynomial multiplication is typically one of the critical operations within lattice-based PQC, its hardware acceleration has drawn significant attention from the research community recently. We propose a high-speed processing strategy to construct a new High-performance Polynomial Multiplication Accelerator (HPMA) for key encapsulation mechanism (KEM) Saber. Firstly, we have given a detailed mathematical derivation to obtain a low-latency processing algorithm for Saber polynomial multiplication. Then, we have innovatively used the derived the proposed algorithm to construct a new structure HPMA for FPGA implementation. Lastly, we have demonstrated the superior performance of the proposed HPMA-Saber by comparing with state-of-the-art works. The proposed design strategy is highly efficient and the obtained results can be useful for the PQC research community. 
    more » « less