Search for: All records

Creators/Authors contains: "Farahmand, Farnoud"

« Prev Next »

Total Resources

13

Resource Type
Conference Paper

12

Conference Proceeding

0

Dataset

0

Journal Article

1

Workshop Report

0

Availability
Full Text / Resource Available

13

Citation Only

0

Save Results
Excel (limit 2000)
CSV (limit 5000)
XML (limit 5000)

Have feedback or suggestions for a way to improve these results?
!

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Side-channel Resistant Implementations of a Novel Lightweight Authenticated Cipher with Application to Hardware Security

https://doi.org/10.1145/3453688.3461761

Abdulgadir, Abubakr ; Lin, Sammy ; Farahmand, Farnoud ; Kaps, Jens-Peter ; Gaj, Kris ( June 2021 , GLSVLSI '21: Proceedings of the 2021 on Great Lakes Symposium on VLSI)
null (Ed.)
Lightweight authenticated ciphers are crucial in many resource-constrained applications, including hardware security. To protect Intellectual Property (IPs) from theft and reverse-engineering, multiple obfuscation methods have been developed. An essential component of such schemes is the need for secrecy and authenticity of the obfuscation keys. Such keys may need to be exchanged through the unprotected channels, and their recovery attempted using side-channel attacks. However, the use of the current AES-GCM standard to protected key exchange requires a substantial area and power overhead. NIST is currently coordinating a standardization process to select lightweight algorithms for resource-constrained applications. Although security against cryptanalysis is paramount, cost, performance, and resistance to side-channel attacks are among the most important selection criteria. Since the cost of protection against side-channel attacks is a function of the algorithm, quantifying this cost is necessary for estimating its cost and performance in real-world applications. In this work, we investigate side-channel resistant lightweight implementations of an authenticated cipher TinyJAMBU, one of ten finalists in the current NIST LWC standardization process. Our results demonstrate that these implementations achieve robust security against side-channel attacks while keeping the area and power consumption significantly lower than it is possible using the current standards.
more » « less
Full Text Available
Hardware Benchmarking of Round 2 Candidates in the NIST Lightweight Cryptography Standardization Process

https://doi.org/10.23919/DATE51398.2021.9473930

Mohajerani, Kamyar ; Haeussler, Richard ; Nagpal, Rishub ; Farahmand, Farnoud ; Abdulgadir, Abubakr ; Kaps, Jens-Peter ; Gaj, Kris ( February 2021 , 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE))
null (Ed.)
Twenty five Round 2 candidates in the NIST Lightweight Cryptography (LWC) process have been implemented in hardware by groups from all over the world. All implementations compliant with the LWC Hardware API, proposed in 2019, have been submitted for hardware benchmarking to George Mason University's LWC benchmarking team. The received submissions were first verified for correct functionality and compliance with the hardware API's specification. Then, the execution times in clock cycles, as a function of input sizes, have been determined using behavioral simulation. The compatibility of all implementations with FPGA toolsets from three major vendors, Xilinx, Intel, and Lattice Semiconductor was verified. Optimized values of the maximum clock frequency and resource utilization metrics, such as the number of look-up tables (LUTs) and flip-flops (FFs), were obtained by running optimization tools, such as Minerva, ATHENa, and Xeda. The raw post-place and route results were then converted into values of the corresponding throughputs for long, medium-size, and short inputs. The results were presented in the form of easy to interpret graphs and tables, demonstrating the relative performance of all investigated algorithms. An effort was made to make the entire process as transparent as possible and results easily reproducible by other groups.
more » « less
Full Text Available
Implementation and Benchmarking of Round 2 Candidates in the NIST Post-Quantum Cryptography Standardization Process Using Hardware and Software/Hardware Co-design Approaches

Dang, Viet B. ; Farahmand, Farnoud ; Andrzejczak, Michal ; Mohajerani, Kamyar ; Nguyen, Duc T. ; Gaj, Kris ( October 2020 , Cryptology ePrint Archive: Report 2020/795)
null (Ed.)
Performance in hardware has typically played a major role in differentiating among leading candidates in cryptographic standardization efforts. Winners of two past NIST cryptographic contests (Rijndael in case of AES and Keccak in case of SHA-3) were ranked consistently among the two fastest candidates when implemented using FPGAs and ASICs. Hardware implementations of cryptographic operations may quite easily outperform software implementations for at least a subset of major performance metrics, such as speed, power consumption, and energy usage, as well as in terms of security against physical attacks, including side-channel analysis. Using hardware also permits much higher flexibility in trading one subset of these properties for another. A large number of candidates at the early stages of the standardization process makes the accurate and fair comparison very challenging. Nevertheless, in all major past cryptographic standardization efforts, future winners were identified quite early in the evaluation process and held their lead until the standard was selected. Additionally, identifying some candidates as either inherently slow or costly in hardware helped to eliminate a subset of candidates, saving countless hours of cryptanalysis. Finally, early implementations provided a baseline for future design space explorations, paving a way to more comprehensive and fairer benchmarking at the later stages of a given cryptographic competition. In this paper, we first summarize, compare, and analyze results reported by other groups until mid-May 2020, i.e., until the end of Round 2 of the NIST PQC process. We then outline our own methodology for implementing and benchmarking PQC candidates using both hardware and software/hardware co-design approaches. We apply our hardware approach to 6 lattice-based CCA-secure Key Encapsulation Mechanisms (KEMs), representing 4 NIST PQC submissions. We then apply a software-hardware co-design approach to 12 lattice-based CCA-secure KEMs, representing 8 Round 2 submissions. We hope that, combined with results reported by other groups, our study will provide NIST with helpful information regarding the relative performance of a significant subset of Round 2 PQC candidates, assuming that at least their major operations, and possibly the entire algorithms, are off-loaded to hardware.
more » « less
Full Text Available
Implementation and Benchmarking of Round 2 Candidates in the NIST Post-Quantum Cryptography Standardization Process Using Hardware and Software/Hardware Co-design Approaches

Dang, Viet B. ; Farahmand, Farnoud ; Andrzejczak, Michal ; Mohajerani, Kamyar ; Nguyen, Duc T. ; Gaj, K. ( June 2020 , Cryptology ePrint Archive: Report 2020/795)

Performance in hardware has typically played a major role in differentiating among leading candidates in cryptographic standardization efforts. Winners of two past NIST cryptographic contests (Rijndael in case of AES and Keccak in case of SHA-3) were ranked consistently among the two fastest candidates when implemented using FPGAs and ASICs. Hardware implementations of cryptographic operations may quite easily outperform software implementations for at least a subset of major performance metrics, such as speed, power consumption, and energy usage, as well as in terms of security against physical attacks, including side-channel analysis. Using hardware also permits much higher flexibility in trading one subset of these properties for another. A large number of candidates at the early stages of the standardization process makes the accurate and fair comparison very challenging. Nevertheless, in all major past cryptographic standardization efforts, future winners were identified quite early in the evaluation process and held their lead until the standard was selected. Additionally, identifying some candidates as either inherently slow or costly in hardware helped to eliminate a subset of candidates, saving countless hours of cryptanalysis. Finally, early implementations provided a baseline for future design space explorations, paving a way to more comprehensive and fairer benchmarking at the later stages of a given cryptographic competition. In this paper, we first summarize, compare, and analyze results reported by other groups until mid-May 2020, i.e., until the end of Round 2 of the NIST PQC process. We then outline our own methodology for implementing and benchmarking PQC candidates using both hardware and software/hardware co-design approaches. We apply our hardware approach to 6 lattice-based CCA-secure Key Encapsulation Mechanisms (KEMs), representing 4 NIST PQC submissions. We then apply a software-hardware co-design approach to 12 lattice-based CCA-secure KEMs, representing 8 Round 2 submissions. We hope that, combined with results reported by other groups, our study will provide NIST with helpful information regarding the relative performance of a significant subset of Round 2 PQC candidates, assuming that at least their major operations, and possibly the entire algorithms, are off-loaded to hardware.
more » « less
Full Text Available
Full hardware implementation of the Post-Quantum Public-Key Cryptography Scheme Round5

https://doi.org/10.1109/ReConFig48160.2019.8994765

Andrzejczak, Michal ; Farahmand, Farnoud ; Gaj, Kris ( December 2019 , 2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig), Cancun, Mexico)

We describe a generic high-speed hardware architecture for the lattice-based post-quantum cryptosystem Round5. This architecture supports both public-key encryption (PKE) and a key encapsulation mechanism (KEM). Due to several hardware-friendly features, Round5 can achieve very high performance when implemented in modern FPGAs.
more » « less
Full Text Available
Implementing and Benchmarking Three Lattice-Based Post-Quantum Cryptography Algorithms Using Software/Hardware Codesign

https://doi.org/10.1109/ICFPT47387.2019.00032

Dang, Viet B. ; Farahmand, Farnoud ; Andrzejczak, Michal ; Gaj, Kris ( December 2019 , 2019 International Conference on Field-Programmable Technology (ICFPT), Tianjin, China)

It has been predicted that within the next tenfifteen years, quantum computers will have computational power sufficient to break current public-key cryptography schemes. When that happens, all traditional methods of dealing with the growing computational capabilities of potential attackers, such as increasing key sizes, will be futile. The only viable solution is to develop new standards based on algorithms that are resistant to quantum computer attacks and capable of being executed on traditional computing platforms, such as microprocessors and FPGAs. Leading candidates for new standards include lattice-based post-quantum cryptography (PQC) algorithms. In this paper, we present the results of implementing and benchmarking three lattice-based key encapsulation mechanisms (KEMs) that have progressed to Round 2 of the NIST standardization process. Our implementations are based on a software/hardware codesign approach, which is particularly applicable to the current stage of the NIST PQC standardization process, where the large number and high complexity of the candidates make traditional hardware benchmarking extremely challenging. We propose and justify the choice of a suitable system-on-chip platform and design methodology. The obtained results indicate the potential for very substantial speed-ups vs. purely software implementations, reaching 28x for encapsulation and 20x for decapsulation.
more » « less
Full Text Available
Software/Hardware Codesign of the Post Quantum Cryptography Algorithm NTRUEncrypt Using High-Level Synthesis and Register-Transfer Level Design Methodologies

Farahmand, Farnoud ; Nguyen, Duc Tri ; Dang, Viet B. ; Ferozpuri, Ahmed ; Gaj, Kris ( September 2019 , 2019 29th International Conference on Field Programmable Logic and Applications (FPL), Barcelona, Spain, Sep. 9-13, 2019)

When quantum computers become scalable and reliable, they are likely to break all public-key cryptography standards, such as RSA and Elliptic Curve Cryptography. The projected threat of quantum computers has led the U.S. National Institute of Standards and Technology (NIST) to an effort aimed at replacing existing public-key cryptography standards with new quantum-resistant alternatives. In December 2017, 69 candidates were accepted by NIST to Round 1 of the NIST Post-Quantum Cryptography (PQC) standardization process. NTRUEncrypt is one of the most well-known PQC algorithms that has withstood cryptanalysis. The speed of NTRUEncrypt in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate speeding up NTRUEncrypt using software/hardware codesign on a Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). Polynomial multiplication is implemented in the Programmable Logic (PL) of Zynq using two approaches: traditional Register-Transfer Level (RTL) and High-Level Synthesis (HLS). The remaining operations of NTRUEncrypt are executed in software on the Processing System (PS) of Zynq, using the bare-metal mode. The speed-up of our software/hardware codesigns vs. purely software implementations is determined experimentally and analyzed in the paper. The results are reported for the RTL-based and HLS-based hardware accelerators, and compared to the best available software implementation, included in the NIST submission package. The speed-ups for encryption were 2.4 and 3.9, depending on the selected parameter set. For decryption, the corresponding speed-ups were 4.0 and 6.8. In addition, for the polynomial multiplication operation itself, the speed up was in excess of 75. Our code for the NTRUEncrypt polynomial multiplier accelerator is being made open-source for further evaluation on multiple software/hardware platforms.
more » « less
Full Text Available
Software/Hardware Codesign of the Post Quantum Cryptography Algorithm NTRUEncrypt Using High-Level Synthesis and Register-Transfer Level Design Methodologies

https://doi.org/10.1109/FPL.2019.00042

Farahmand, Farnoud ; Nguyen, Duc Tri ; Dang, Viet B. ; Ferozpuri, Ahmed ; Gaj, Kris ( September 2019 , 2019 29th International Conference on Field Programmable Logic and Applications (FPL), Barcelona, Spain)

When quantum computers become scalable and reliable, they are likely to break all public-key cryptography standards, such as RSA and Elliptic Curve Cryptography. The projected threat of quantum computers has led the U.S. National Institute of Standards and Technology (NIST) to an effort aimed at replacing existing public-key cryptography standards with new quantum-resistant alternatives. In December 2017, 69 candidates were accepted by NIST to Round 1 of the NIST Post-Quantum Cryptography (PQC) standardization process. NTRUEncrypt is one of the most well-known PQC algorithms that has withstood cryptanalysis. The speed of NTRUEncrypt in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate speeding up NTRUEncrypt using software/hardware codesign on a Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). Polynomial multiplication is implemented in the Programmable Logic (PL) of Zynq using two approaches: traditional Register-Transfer Level (RTL) and High-Level Synthesis (HLS). The remaining operations of NTRUEncrypt are executed in software on the Processing System (PS) of Zynq, using the bare-metal mode. The speed-up of our software/hardware codesigns vs. purely software implementations is determined experimentally and analyzed in the paper. The results are reported for the RTL-based and HLS-based hardware accelerators, and compared to the best available software implementation, included in the NIST submission package. The speed-ups for encryption were 2.4 and 3.9, depending on the selected parameter set. For decryption, the corresponding speed-ups were 4.0 and 6.8. In addition, for the polynomial multiplication operation itself, the speed up was in excess of 75. Our code for the NTRUEncrypt polynomial multiplier accelerator is being made open-source for further evaluation on multiple software/hardware platforms.
more » « less
Full Text Available
Evaluating the Potential for Hardware Acceleration of Four NTRU-Based Key Encapsulation Mechanisms Using Software/Hardware Codesign

https://doi.org/https://doi.org/10.1007/978-3-030-25510-7_2

Farahmand, Farnoud ; Dang, Viet B. ; Nguyen, Duc Tri ; Gaj, Kris ( May 2019 , Post-Quantum Cryptography, 10th International Conference, PQCrypto 2019 Chongqing, China, May 8–10, 2019 Revised Selected Papers, Lecture Notes in Computer Science)

The speed of NTRU-based Key Encapsulation Mechanisms (KEMs) in software, especially on embedded software platforms, is limited by the long execution time of its primary operation, polynomial multiplication. In this paper, we investigate the potential for speeding up the implementations of four NTRU-based KEMs, using software/hardware codesign, when targeting Xilinx Zynq UltraScale+ multiprocessor system-on-chip (MPSoC). All investigated algorithms compete in Round 1 of the NIST PQC standardization process. They include: ntru-kem from the NTRUEncrypt submission, Streamlined NTRU Prime and NTRU LPRime KEMs of the NTRU Prime candidate, and NTRU- HRSS-KEM from the submission of the same name. The most time-consuming operation, polynomial multiplication, is implemented in the Programmable Logic (PL) of Zynq UltraScale+ (i.e., in hardware) using constant-time hardware architectures most appropriate for a given algorithm. The remaining operations are executed in the Processing System (PS) of Zynq, based on the ARM Cortex-A53 Application Processing Unit. The speed-ups of our software/hardware codesigns vs. purely software implementations, running on the same Zynq platform, are determined experimentally, and analyzed in the paper. Our experiments reveal substantial differences among the investigated candidates in terms of their potential to benefit from hardware accelerators, with the special focus on accelerators aimed at offloading to hardware only the most time-consuming operation of a given cryptosystems. The demonstrated speed-ups vs. functionally equivalent purely software implementations vary between 4.0 and 42.7 for encapsulation, and between 6.4 and 149.7 for decapsulation.
more » « less
Full Text Available
Face-off between the CAESAR Lightweight Finalists: ACORN vs. Ascon

Diehl, William ; Farahmand, Farnoud ; Abdulgadir, Abubakr ; Kaps, Jens-Peter ; Gaj, Kris ( December 2018 , 2018 International Conference on Field-Programmable Technology)

Authenticated ciphers potentially provide resource savings and security improvements over the joint use of secret-key ciphers and message authentication codes. The CAESAR competition aims to choose the most suitable authenticated ciphers for several categories of applications, including a lightweight use case, for which the primary criteria are performance in resource constrained devices, and ease of protection against side channel attacks (SCA). Recently, two of the candidates from this category, ACORN and Ascon, were selected as CAESAR contest finalists. In this research, we compare two SCA-resistant FPGA implementations of ACORN and Ascon, where one set of implementations has area consumption nearly equivalent to the defacto standard AES-GCM, and the other set has throughput (TP) close to that of AES-GCM. The results show that protected implementations of ACORN and Ascon, with area consumption less than but close to AES-GCM, have 23.3 and 2.5 times, respectively, the TP of AES-GCM. Likewise, implementations of ACORN and Ascon with TP greater than but close to AES-GCM, consume 18% and 74% of the area, respectively, of AES-GCM.
more » « less
Full Text Available

« Prev Next »