skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, October 10 until 2:00 AM ET on Friday, October 11 due to maintenance. We apologize for the inconvenience.


Search for: All records

Award ID contains: 1716541

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. We present the first specification-compliant constant-time FPGA implementation of the Classic McEliece cryptosystem from the third-round of NIST’s Post-Quantum Cryptography standardization process. In particular, we present the first complete implementation including encapsulation and decapsulation modules as well as key generation with seed expansion. All the hardware modules are parametrizable, at compile time, with security level and performance parameters. As the most time consuming operation of Classic McEliece is the systemization of the public key matrix during key generation, we present and evaluate three new algorithms that can be used for systemization while complying with the specification: hybrid early-abort systemizer (HEA), single-pass early-abort systemizer (SPEA), and dual-pass earlyabort systemizer (DPEA). All of the designs outperform the prior systemizer designs for Classic McEliece by 2.2x to 2.6x in average runtime and by 1.7x to 2.4x in time-area efficiency. We show that our complete Classic McEliece design for example can perform key generation in 5.2 ms to 20 ms, encapsulation in 0.1 ms to 0.5 ms, and decapsulation in 0.7 ms to 1.5 ms for all security levels on an Xlilinx Artix 7 FPGA. The performance can be increased even further at the cost of resources by increasing the level of parallelization using the performance parameters of our design.

     
    more » « less
  2. null (Ed.)
    Information leakage in FPGAs poses a danger whenever multiple users share the reconfigurable fabric, for example in multi-tenant Cloud FPGAs, or whenever a potentially malicious IP module is synthesized within a single user's design on an FPGA. In such scenarios, capacitive crosstalk between so-called long routing wires has been previously shown to be a security vulnerability in both Xilinx and Intel FPGAs. Specifically, both static and dynamic values on long wires have been demonstrated to affect the delays of the adjacent long wires, and such delay changes have been exploited to steal sensitive information such as bits of cryptographic keys. While long-wire leakage is now well-understood and can be defended against, this work presents two other, new types of information leaks that pose similar risks, but which have not been studied in the past, and for which existing defenses do not work. First, this paper shows that other types of routing resources (namely medium wires) are also vulnerable to crosstalk, with changes in their delays also measurable fully on-chip. Second, this work introduces a novel source of information leaks that originates from logic elements within the FPGA Configurable Logic Blocks (CLBs) and is likely not the result of the capacitive crosstalk effects investigated in prior work. To understand the potential impact of the two new leakage sources, this paper experimentally characterizes and compares them in four families of Xilinx FPGAs, and discusses potential countermeasures in the context of existing attacks and defenses. 
    more » « less
  3. null (Ed.)
    This paper presents the first 28 nm ASIC implementation of an accelerator for the post-quantum digital signature scheme XMSS. In particular, this paper presents an architecture for a novel, pipelined XMSS Leaf accelerator for accelerating the most compute-intensive step in the XMSS algorithm. This paper then presents the ASIC designs for both an existing non-pipelined accelerator architecture and the novel, pipelined XMSS Leaf accelerator. In addition, the performance of the28 nm ASIC is compared to the same designs on 28 nm Artix-7FPGAs. The novel pipelined XMSS Leaf accelerator is 25% faster compared to the non-pipelined version in the ASIC, and both accelerator architectures have a 10×lower power consumption than on the FPGAs. The evaluation shows that the pipelining increases the frequency by 1.7×on the FPGA but only 1.2×on the ASIC, due to the critical path in the ASIC being in the memory. The non-pipelined XMSS Leaf accelerator is shown to have a significantly better area-delay and energy-delay metric on the ASIC, while the pipelined accelerator wins out in these metrics on the FPGA. Consequently, this work shows the different architectural decisions that need to be made between FPGA and ASIC designs, when selecting how to best implement a post-quantum cryptographic accelerator in hardware. 
    more » « less
  4. This paper presents a set of efficient and parameterized hardware accelerators that target post-quantum lattice-based cryptographic schemes, including a versatile cSHAKE core, a binary-search CDT-based Gaussian sampler, and a pipelined NTT-based polynomial multiplier, among others. Unlike much of prior work, the accelerators are fully open-sourced, are designed to be constant-time, and can be parameterized at compile-time to support different parameters without the need for re-writing the hardware implementation. These flexible, publicly-available accelerators are leveraged to demonstrate the first hardware-software co-design using RISC-V of the post-quantum lattice-based signature scheme qTESLA with provably secure parameters. In particular, this work demonstrates that the NIST’s Round 2 level 1 and level 3 qTESLA variants achieve over a 40-100x speedup for key generation, about a 10x speedup for signing, and about a 16x speedup for verification, compared to the baseline RISC-V software-only implementation. For instance, this corresponds to execution in 7.7, 34.4, and 7.8 milliseconds for key generation, signing, and verification, respectively, for qTESLA’s level 1 parameter set on an Artix-7 FPGA, demonstrating the feasibility of the scheme for embedded applications. 
    more » « less