Abstract—In this article, we provide improvements for the architecture of Supersingular Isogeny Key Encapsulation (SIKE), a post-quantum cryptography candidate.We develop a new highly optimized Montgomery multiplication algorithm and architecture for prime fields. The multiplier occupies less area and provide better timing results than the state-of-the-art.We also provide improvements to the scheduling of SIKE in our programROM.We implement SIKE for all Round 3 NISTsecurity levels (SIKEp434 for NISTsecurity level 1, SIKEp503 for NIST security level 2, SIKEp610 for NISTsecurity level 3, and SIKEp751 for NISTsecurity level 5) on Xilinx Artix 7 and Xilinx Virtex 7 FPGAs. Our best implementation (NISTsecurity level 1) runs 38 percent faster and occupies 30 percent less hardware resources in comparison to the leading counterpart available in the literature and implementations for other security levels achieved similar improvement.
Optimized Algorithms and Architectures for Montgomery Multiplication for Post-quantum Cryptography
Finite field multiplication plays the main role determining the efficiency of public key cryptography systems based on RSA and elliptic curve cryptography (ECC). Most recently, quantum-safe cryptographic systems are proposed based on supersingular isogenies on elliptic curves which require large integer multiplications over extended prime fields. In this work, we present two Montgomery multiplication architectures for special primes used in a post-quantum cryptography system known as supersingular isogeny key encapsulation (SIKE). We optimize two existing Montgomery multiplication algorithms and develop area-efficient and time-efficient Montgomery multiplication architectures for hardware implementations of post-quantum cryptography. Our proposed time-efficient architecture is 32% to 42% faster than the leading one (depending on the prime size) available in the literature which has been used in original SIKE submission to the NIST standardization process. The area-efficient architecture is 42% to 50% smaller than the counterparts and is about 3% to 11% faster depending on the NIST security level.
- Award ID(s):
- 1801341
- Publication Date:
- NSF-PAR ID:
- 10179894
- Journal Name:
- International Conference on Cryptology and Network Security CANS 2019
- Volume:
- 11829
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The rapid advancement in quantum technology has initiated a new round of post-quantum cryptography (PQC) related exploration. The key encapsulation mechanism (KEM) Saber is an important module lattice-based PQC, which has been selected as one of the PQC finalists in the ongoing National Institute of Standards and Technology (NIST) standardization process. On the other hand, however, efficient hardware implementation of KEM Saber has not been well covered in the literature. In this paper, therefore, we propose a novel cyclic-row oriented processing (CROP) strategy for efficient implementation of the key arithmetic operation of KEM Saber, i.e., the polynomial multiplication. The proposed work consists of three layers of interdependent efforts: (i) first of all, we have formulated the main operation of KEM Saber into desired mathematical forms to be further developed into CROP based algorithms, i.e., the basic version and the advanced higher-speed version; (ii) then, we have followed the proposed CROP strategy to innovatively transfer the derived two algorithms into desired polynomial multiplication structures with the help of a series of algorithm-architecture co-implementation techniques; (iii) finally, detailed complexity analysis and implementation results have shown that the proposed polynomial multiplication structures have better area-time complexities than the state-of-the-art solutions. Specifically, the field-programmablemore »
-
When quantum computers become scalable and reliable, they are likely to break all public-key cryptography standards, such as RSA and Elliptic Curve Cryptography. The projected threat of quantum computers has led the U.S. National Institute of Standards and Technology (NIST) to an effort aimed at replacing existing public-key cryptography standards with new quantum-resistant alternatives. In December 2017, 69 candidates were accepted by NIST to Round 1 of the NIST Post-Quantum Cryptography (PQC) standardization process. NTRUEncrypt is one of the most well-known PQC algorithms that has withstood cryptanalysis. The speed of NTRUEncrypt in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate speeding up NTRUEncrypt using software/hardware codesign on a Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). Polynomial multiplication is implemented in the Programmable Logic (PL) of Zynq using two approaches: traditional Register-Transfer Level (RTL) and High-Level Synthesis (HLS). The remaining operations of NTRUEncrypt are executed in software on the Processing System (PS) of Zynq, using the bare-metal mode. The speed-up of our software/hardware codesigns vs. purely software implementations is determined experimentally and analyzed in the paper. The results are reported for the RTL-based and HLS-based hardwaremore »
-
When quantum computers become scalable and reliable, they are likely to break all public-key cryptography standards, such as RSA and Elliptic Curve Cryptography. The projected threat of quantum computers has led the U.S. National Institute of Standards and Technology (NIST) to an effort aimed at replacing existing public-key cryptography standards with new quantum-resistant alternatives. In December 2017, 69 candidates were accepted by NIST to Round 1 of the NIST Post-Quantum Cryptography (PQC) standardization process. NTRUEncrypt is one of the most well-known PQC algorithms that has withstood cryptanalysis. The speed of NTRUEncrypt in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate speeding up NTRUEncrypt using software/hardware codesign on a Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). Polynomial multiplication is implemented in the Programmable Logic (PL) of Zynq using two approaches: traditional Register-Transfer Level (RTL) and High-Level Synthesis (HLS). The remaining operations of NTRUEncrypt are executed in software on the Processing System (PS) of Zynq, using the bare-metal mode. The speed-up of our software/hardware codesigns vs. purely software implementations is determined experimentally and analyzed in the paper. The results are reported for the RTL-based and HLS-based hardwaremore »
-
Yung, M ; L, Shujun. (Ed.)In 2016, the National Institute of Standards and Technology (NIST) initiated a standardization process among the post-quantum secure algorithms. Forming part of the alternate group of candidates after Round 2 of the process is the Supersingular Isogeny Key Encapsulation (SIKE) mechanism which attracts with the smallest key sizes offering post-quantum security in scenarios of limited bandwidth and memory resources. Even further reduction of the exchanged information is offered by the compression mechanism, proposed by Azarderakhsh et al., which, however, introduces a significant time overhead and increases the memory requirements of the protocol, making it challenging to integrate it into an embedded system. In this paper, we propose the first compressed SIKE implementation for a resource-constrained device, where we targeted the NIST recommended platform STM32F407VG featuring ARM Cortex-M4 processor. We integrate the isogeny-based implementation strategies described previously in the literature into the compressed version of SIKE. Additionally, we propose a new assembly design for the finite field operations particular for the compressed SIKE, and observe a speedup of up to 16% and up to 25% compared to the last best-reported assembly implementations for p434, p503, and p610.