Abstract—In this article, we provide improvements for the architecture of Supersingular Isogeny Key Encapsulation (SIKE), a post-quantum
cryptography candidate.We develop a new highly optimized Montgomery multiplication algorithm and architecture for prime fields. The
multiplier occupies less area and provide better timing results than the state-of-the-art.We also provide improvements to the scheduling of
SIKE in our programROM.We implement SIKE for all Round 3 NISTsecurity levels (SIKEp434 for NISTsecurity level 1, SIKEp503 for NIST
security level 2, SIKEp610 for NISTsecurity level 3, and SIKEp751 for NISTsecurity level 5) on Xilinx Artix 7 and Xilinx Virtex 7 FPGAs. Our
best implementation (NISTsecurity level 1) runs 38 percent faster and occupies 30 percent less hardware resources in comparison to the
leading counterpart available in the literature and implementations for other security levels achieved similar improvement.
more »
« less
Optimized Algorithms and Architectures for Montgomery Multiplication for Post-quantum Cryptography
Finite field multiplication plays the main role determining the efficiency of public key cryptography systems based on RSA and elliptic curve cryptography (ECC). Most recently, quantum-safe cryptographic systems are proposed based on supersingular isogenies on elliptic curves which require large integer multiplications over extended prime fields. In this work, we present two Montgomery multiplication architectures for special primes used in a post-quantum cryptography system known as supersingular isogeny key encapsulation (SIKE). We optimize two existing Montgomery multiplication algorithms and develop area-efficient and time-efficient Montgomery multiplication architectures for hardware implementations of post-quantum cryptography. Our proposed time-efficient architecture is 32% to 42% faster than the leading one (depending on the prime size) available in the literature which has been used in original SIKE submission to the NIST standardization process. The area-efficient architecture is 42% to 50% smaller than the counterparts and is about 3% to 11% faster depending on the NIST security level.
more »
« less
- Award ID(s):
- 1801341
- PAR ID:
- 10179894
- Date Published:
- Journal Name:
- International Conference on Cryptology and Network Security CANS 2019
- Volume:
- 11829
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The advances in quantum technologies and the fast move toward quantum computing are threatening classical cryptography and urge the deployment of post-quantum (PQ) schemes. The only isogeny-based candidate forming part of the third round of the standardization, the Supersingular Isogeny Key Encapsulation (SIKE) mechanism, is a subject of constant latency optimizations given its attractive compact key size lengths and, thus, its limited bandwidth and memory requirements. In this work, we present a new speed record of the SIKE protocol by implementing novel low-level finite field arithmetic targeting ARMv7-M architecture. We develop a handcrafted assembly code for the modular multiplication and squaring functions where we obtain 8.71% and 5.38% of speedup, respectively, compared to the last best-reported assembly implementations for p434. After deploying the finite field optimized architecture to the SIKE protocol, we observe 5.63%, 3.93%, 3.48%, and 1.61% of latency reduction for SIKE p434, p503, p610, and p751, respectively, targeting the NIST recommended STM32F407VG discovery board for our experiments.more » « less
-
When quantum computers become scalable and reliable, they are likely to break all public-key cryptography standards, such as RSA and Elliptic Curve Cryptography. The projected threat of quantum computers has led the U.S. National Institute of Standards and Technology (NIST) to an effort aimed at replacing existing public-key cryptography standards with new quantum-resistant alternatives. In December 2017, 69 candidates were accepted by NIST to Round 1 of the NIST Post-Quantum Cryptography (PQC) standardization process. NTRUEncrypt is one of the most well-known PQC algorithms that has withstood cryptanalysis. The speed of NTRUEncrypt in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate speeding up NTRUEncrypt using software/hardware codesign on a Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). Polynomial multiplication is implemented in the Programmable Logic (PL) of Zynq using two approaches: traditional Register-Transfer Level (RTL) and High-Level Synthesis (HLS). The remaining operations of NTRUEncrypt are executed in software on the Processing System (PS) of Zynq, using the bare-metal mode. The speed-up of our software/hardware codesigns vs. purely software implementations is determined experimentally and analyzed in the paper. The results are reported for the RTL-based and HLS-based hardware accelerators, and compared to the best available software implementation, included in the NIST submission package. The speed-ups for encryption were 2.4 and 3.9, depending on the selected parameter set. For decryption, the corresponding speed-ups were 4.0 and 6.8. In addition, for the polynomial multiplication operation itself, the speed up was in excess of 75. Our code for the NTRUEncrypt polynomial multiplier accelerator is being made open-source for further evaluation on multiple software/hardware platforms.more » « less
-
When quantum computers become scalable and reliable, they are likely to break all public-key cryptography standards, such as RSA and Elliptic Curve Cryptography. The projected threat of quantum computers has led the U.S. National Institute of Standards and Technology (NIST) to an effort aimed at replacing existing public-key cryptography standards with new quantum-resistant alternatives. In December 2017, 69 candidates were accepted by NIST to Round 1 of the NIST Post-Quantum Cryptography (PQC) standardization process. NTRUEncrypt is one of the most well-known PQC algorithms that has withstood cryptanalysis. The speed of NTRUEncrypt in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate speeding up NTRUEncrypt using software/hardware codesign on a Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). Polynomial multiplication is implemented in the Programmable Logic (PL) of Zynq using two approaches: traditional Register-Transfer Level (RTL) and High-Level Synthesis (HLS). The remaining operations of NTRUEncrypt are executed in software on the Processing System (PS) of Zynq, using the bare-metal mode. The speed-up of our software/hardware codesigns vs. purely software implementations is determined experimentally and analyzed in the paper. The results are reported for the RTL-based and HLS-based hardware accelerators, and compared to the best available software implementation, included in the NIST submission package. The speed-ups for encryption were 2.4 and 3.9, depending on the selected parameter set. For decryption, the corresponding speed-ups were 4.0 and 6.8. In addition, for the polynomial multiplication operation itself, the speed up was in excess of 75. Our code for the NTRUEncrypt polynomial multiplier accelerator is being made open-source for further evaluation on multiple software/hardware platforms.more » « less
-
We present new results and speedups for the large-degree isogeny computations within the extended supersingular isogeny Diffie-Hellman (eSIDH) key agreement framework. As proposed by Cervantes-Vázquez, Ochoa-Jiménez, and Rodríguez-Henríquez, eSIDH is an extension to SIDH and fourth round NIST post-quantum cryptographic standardization candidate SIKE. By utilizing multiprime large-degree isogenies, eSIDH and eSIKE are faster than the standard SIDH/SIKE and amenable to parallelization techniques that can noticeably increase their speed with multiple cores. Here, we investigate the use of multiprime isogeny strategies to speed up eSIDH and eSIKE in serial implementations. These strategies have been investigated for other isogeny schemes such as CSIDH. We apply them to the eSIDH/eSIKE scenario to speed up the multiprime strategy by about 10%. When applied to eSIDH, we achieve a 7–8% speedup for Bob’s shared key agreement operation. When applied to eSIKE, we achieve a 3–4% speedup for key decapsulation. Historically, SIDH and SIKE have been considerably slower than its competitors in the NIST PQC standardization process. These results continue to highlight the various speedups achievable with the eSIKE framework to alleviate these speed concerns. Though eSIDH and eSIKE are susceptible to the recent devastating attacks on SIKE, our analysis applies to smooth degree isogeny computations in general, and isogeny-based signature schemes which use isogenies of smooth (not necessarily powersmooth) degree.more » « less