skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: PUF-Kyber: Design of a PUF-Based Kyber Architecture Benchmarked on Diverse ARM Processors
It is well-studied that quantum computing breaks the security of the current worldwide implemented public key cryptosystems. This forces us toward post quantum cryptography (PQC) whose security remains solid even against adversaries having access to quantum computers. For this matter, national institute of standards and technology (NIST) announced four winners in 2022. Among them, CRYSTALS-Kyber which is the only key encapsulation mechanism (KEM)/PKE algorithm, is the aim of this article. In this article, through using physical unclonable functions (PUFs) and true random number generators (TRNGs), we improve the overall security of Kyber and provide physical security to it. Our implementation results on ARMv7 and ARMv8 architectures, indicate significant speedup, compared to the reference work. For example, for the CCA.KEM-KeyGen() algorithm, we achieved roughly 26%, 13%, and 10% speedup at security levels of 512, 768, and 1024 on ARMv7 implementation, and 25%, 12%, and 10% for ARMv8 implementation. Comparing the implementation results of our design with the reference work indicates that both the security and the system performance are improved.  more » « less
Award ID(s):
2147196
PAR ID:
10586518
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE transactions on circuits and systems I Regular papers
ISSN:
1558-0806
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Cheon, Jung Hee; Tillich, Jean-Pierre (Ed.)
    This paper focuses on high-speed NEON-based constant-time implementations of multiplication of large polynomials in the NIST PQC KEM Finalists: NTRU, Saber, and CRYSTALS-Kyber. We use the Number Theoretic Transform (NTT)-based multiplication in Kyber, the Toom-Cook algorithm in NTRU, and both types of multiplication in Saber. Following these algorithms and using Apple M1, we improve the decapsulation performance of the NTRU, Kyber, and Saber-based KEMs at the security level 3 by the factors of 8.4, 3.0, and 1.6, respectively, compared to the reference implementations. On Cortex-A72, we achieve the speed-ups by factors varying between 5.7 and 7.5x for the Forward/Inverse NTT in Kyber, and between 6.0 and 7.8x for Toom-Cook in NTRU, over the best existing implementations in pure C. For Saber, when using NEON instructions on Cortex-A72, the implementation based on NTT outperforms the implementation based on the Toom-Cook algorithm by 14% in the case of the MatrixVectorMul function but is slower by 21% in the case of the InnerProduct function. Taking into account that in Saber, keys are not available in the NTT domain, the overall performance of the NTT-based version is very close to the performance of the Toom-Cook version. The differences for the entire decapsulation at the three major security levels (1, 3, and 5) are −4, −2, and +2%, respectively. Our benchmarking results demonstrate that our NEON-based implementations run on an Apple M1 ARM processor are comparable to those obtained using the best AVX2-based implementations run on an AMD EPYC 7742 processor. Our work is the first NEON-based ARMv8 implementation of each of the three NIST PQC KEM finalists. 
    more » « less
  2. Public-key cryptography based on the lattice problem is efficient and believed to be secure in a post-quantum era. In this paper, we introduce carefully-optimized implementations of Kyber encryption schemes for 64-bit ARM Cortex-A processors. Our research contribution includes optimizations for Number Theoretic Transform (NTT), noise sampling, and AES accelerator based symmetric function implementations. The proposed Kyber512 implementation on ARM64 improved previous works by 1.79×, 1.96×, and 2.44× for key generation, encapsulation, and decapsulation, respectively. Moreover, by using AES accelerator in the proposed Kyber512-90s implementation, it is improved by 8.57×, 6.94×, and 8.26× for key generation, encapsulation, and decapsulation, respectively. 
    more » « less
  3. In this work, we present the first highly-optimized implementation of Supersingular Isogeny Key Encapsulation (SIKE) submitted to NIST’s second round of post quantum standardization process, on 64-bit ARMv8 processors. To the best of our knowledge, this work is the first optimized implementation of SIKE round 2 on 64-bit ARM over SIKEp434 and SIKEp610. The proposed library is explicitly optimized for these two security levels and provides constant-time implementation of the SIKE mechanism on ARMv8-powered embedded devices. We adapt different optimization techniques to reduce the total number of underlying arithmetic operations on the filed level. In particular, the benchmark results on embedded processors equipped with ARM Cortex-A53@1.536 GHz show that the entire SIKE round 2 key encapsulation mechanism takes only 84 ms at NIST’s security level 1. Considering SIKE’s extremely small key size in comparison to other candidates, our result implies that SIKE is one of the promising candidates for key encapsulation mechanism on embedded devices in the quantum era. 
    more » « less
  4. CRYSTAL-Kyber (Kyber) is one of the post-quantum cryptography (PQC) key-encapsulation mechanism (KEM) schemes selected during the standardization process. This paper addresses optimization for Kyber architecture with respect to latency and throughput constraints. Specifically, matrix-vector multiplication and number theoretic transform (NTT)-based polynomial multiplication are critical operations and bottle-necks that require optimization. To address this challenge, we propose an algorithm and hardware co-design approach to systematically optimize matrix-vector multiplication and NTT-based polynomial multiplication by employing a novel sub-structure sharing technique in order to reduce computational complexity, i.e., the number of modular multiplications and modular additions/subtractions consumed. The sub-structure sharing approach is inspired by prior fast parallel approaches based on polyphase decomposition. The proposed efficient feed-forward architecture achieves high speed, low latency, and full utilization of all hardware components, which can significantly enhance the overall efficiency of the Kyber scheme. The FPGA implementation results show that our proposed design, using the fast two-parallel structure, leads to an approximate reduction of 90% in execution time (μs) , along with a 66× improvement in throughput performance. 
    more » « less
  5. The development of fifth-generation (5G) technology marks a significant milestone for digital communication systems, providing substantial improvements in data transmission speeds and enabling enhanced connectivity across a wider range of devices. However, this rapid increase in data volume also introduces new challenges related to transmission latency, reliability, and security. This paper introduces KyMLP-LDPC, a novel approach that integrates a multi-layer parallel LDPC (MLP-LDPC) algorithm with Kyber, a post-quantum cryptography scheme, to accelerate and enable reliable and secure transmission. MLP-LDPC partitions the LDPC parity check matrix into processing groups to streamline parallel decoding and minimize message collisions during transmission, thereby accelerating error correction operations. Kyber encrypts data preemptively to safeguard against potential attacks. The effectiveness of our proposed method is evaluated using both image data and signals transmitted through an additive white Gaussian noise communication channel. Evaluation results demonstrate that the proposed method achieves superior performance in terms of error correction capabilities and data security compared to existing approaches. 
    more » « less