skip to main content


Title: Side-Channel Analysis and Countermeasure Design for Implementation of Curve448 on Cortex-M4
The highly secure Curve448 cryptographic algorithm has been recently recommended by NIST. While this algorithm provides 224-bit security over elliptic curve cryptography, its implementation may still be vulnerable to physical sidechannel attacks. In this paper, we present a speed-optimized implementation on a 32-bit ARM Cortex-M4 platform achieving more than 40% improvement compared to the best previous work. Our design can perform 43 scalar multiplications per second on an STM32F4 working at 168 MHz. At 24 MHz, our proposed implementation takes only 3,740k clock cycles. On the other hand, the security of Curve448 is thoroughly evaluated to have a trade-off between performance and required protection. We apply different effective countermeasures to prevent a subset of side-channel and fault injection attacks at the cost of 8%-22% overhead.  more » « less
Award ID(s):
2101085
NSF-PAR ID:
10425459
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Hardware and Architectural Support for Security and Privacy (HASP’22)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Motivated by the rise of quantum computers, existing public-key cryptosystems are expected to be replaced by post-quantum schemes in the next decade in billions of devices. To facilitate the transition, NIST is running a standardization process which is currently in its final Round. Only three digital signature schemes are left in the competition, among which Dilithium and Falcon are the ones based on lattices. Besides security and performance, significant attention has been given to resistance against implementation attacks that target side-channel leakage or fault injection response. Classical fault attacks on signature schemes make use of pairs of faulty and correct signatures to recover the secret key which only works on deterministic schemes. To counter such attacks, Dilithium offers a randomized version which makes each signature unique, even when signing identical messages. In this work, we introduce a novel Signature Correction Attack which not only applies to the deterministic version but also to the randomized version of Dilithium and is effective even on constant-time implementations using AVX2 instructions. The Signature Correction Attack exploits the mathematical structure of Dilithium to recover the secret key bits by using faulty signatures and the public-key. It can work for any fault mechanism which can induce single bit-flips. For demonstration, we are using Rowhammer induced faults. Thus, our attack does not require any physical access or special privileges, and hence could be also implemented on shared cloud servers. Using Rowhammer attack, we inject bit flips into the secret key s1 of Dilithium, which results in incorrect signatures being generated by the signing algorithm. Since we can find the correct signature using our Signature Correction algorithm, we can use the difference between the correct and incorrect signatures to infer the location and value of the flipped bit without needing a correct and faulty pair. To quantify the reduction in the security level, we perform a thorough classical and quantum security analysis of Dilithium and successfully recover 1,851 bits out of 3,072 bits of secret key $s_{1}$ for security level 2. Fully recovered bits are used to reduce the dimension of the lattice whereas partially recovered coefficients are used to to reduce the norm of the secret key coefficients. Further analysis for both primal and dual attacks shows that the lattice strength against quantum attackers is reduced from 2128 to 281 while the strength against classical attackers is reduced from 2141 to 289. Hence, the Signature Correction Attack may be employed to achieve a practical attack on Dilithium (security level 2) as proposed in Round 3 of the NIST post-quantum standardization process. 
    more » « less
  2. null (Ed.)
    Security of machine learning is increasingly becoming a major concern due to the ubiquitous deployment of deep learning in many security-sensitive domains. Many prior studies have shown external attacks such as adversarial examples that tamper the integrity of DNNs using maliciously crafted inputs. However, the security implication of internal threats (i.e., hardware vulnerabilities) to DNN models has not yet been well understood. In this paper, we demonstrate the first hardware-based attack on quantized deep neural networks–DeepHammer–that deterministically induces bit flips in model weights to compromise DNN inference by exploiting the rowhammer vulnerability. DeepHammer performs an aggressive bit search in the DNN model to identify the most vulnerable weight bits that are flippable under system constraints. To trigger deterministic bit flips across multiple pages within a reasonable amount of time, we develop novel system-level techniques that enable fast deployment of victim pages, memory-efficient rowhammering and precise flipping of targeted bits. DeepHammer can deliberately degrade the inference accuracy of the victim DNN system to a level that is only as good as random guess, thus completely depleting the intelligence of targeted DNN systems. We systematically demonstrate our attacks on real systems against 11 DNN architectures with 4 datasets corresponding to different application domains. Our evaluation shows that DeepHammer is able to successfully tamper DNN inference behavior at run-time within a few minutes. We further discuss several mitigation techniques from both algorithm and system levels to protect DNNs against such attacks. Our work highlights the need to incorporate security mechanisms in future deep learning systems to enhance the robustness against hardware-based deterministic fault injections. 
    more » « less
  3. Bitslicing is a software implementation technique that treats an N-bit processor datapath as N parallel single-bit datapaths. Bitslicing is particularly useful to implement data-parallel algorithms, algorithms that apply the same operation sequence to every element of a vector. Indeed, a bit-wise processor instruction applies the same logical operation to every single-bit slice. A second benefit of bitsliced execution is that the natural spatial redundancy of bitsliced software can support countermeasures against fault attacks. A k-redundant program on an N-bit processor then runs as N/k parallel redundant slices. In this contribution, we combine these two benefits of bitslicing to implement a fault countermeasure for the number-theoretic transform (NTT). The NTT eiciently implements a polynomial multiplication. The internal symmetry of the NTT algorithm lends itself to a data-parallel implementation, and hence it is a good candidate for the redundantly bitsliced implementation. We implement a redundantly bitsliced NTT on an advanced 667MHz ARM Cortex-A9 processor, and study the fault coverage for the protected NTT under optimized electromagnetic fault injection (EMFI). Our work brings two major contributions. First, we show for the irst time how to develop a redundantly bitsliced version of the NTT. We integrate the protected NTT into a full Dilithium signature sequence. Second, we demonstrate an EMFI analysis on a prototype implementation of the Dilithium signature sequence on ARM Cortex-M9. We perform a detailed EM fault-injection parameter search to optimize the location, intensity and timing of injected EM pulses. We demonstrate that, under optimized fault injection parameters, about 10% of the injected faults become potentially exploitable. However, the redundantly bitsliced NTT design is able to catch the majority of these potentially exploitable faults, even when the remainder of the Dilithium algorithm as well as the control low is left unprotected. To our knowledge, this is the irst demonstration of a bitslice-redundant design of the NTT that offers distributed fault detection throughout the execution of the algorithm. 
    more » « less
  4. null (Ed.)
    Post-quantum schemes are expected to replace existing public-key schemes within a decade in billions of devices. To facilitate the transition, the US National Institute for Standards and Technology (NIST) is running a standardization process. Multivariate signatures is one of the main categories in NIST's post-quantum cryptography competition. Among the four candidates in this category, the LUOV and Rainbow schemes are based on the Oil and Vinegar scheme, first introduced in 1997 which has withstood over two decades of cryptanalysis. Beyond mathematical security and efficiency, security against side-channel attacks is a major concern in the competition. The current sentiment is that post-quantum schemes may be more resistant to fault-injection attacks due to their large key sizes and the lack of algebraic structure. We show that this is not true. We introduce a novel hybrid attack, QuantumHammer, and demonstrate it on the constant-time implementation of LUOV currently in Round 2 of the NIST post-quantum competition. The QuantumHammer attack is a combination of two attacks, a bit-tracing attack enabled via Rowhammer fault injection and a divide and conquer attack that uses bit-tracing as an oracle. Using bit-tracing, an attacker with access to faulty signatures collected using Rowhammer attack, can recover secret key bits albeit slowly. We employ a divide and conquer attack which exploits the structure in the key generation part of LUOV and solves the system of equations for the secret key more efficiently with few key bits recovered via bit-tracing. We have demonstrated the first successful in-the-wild attack on LUOV recovering all 11K key bits with less than 4 hours of an active Rowhammer attack. The post-processing part is highly parallel and thus can be trivially sped up using modest resources. QuantumHammer does not make any unrealistic assumptions, only requires software co-location (no physical access), and therefore can be used to target shared cloud servers or in other sandboxed environments. 
    more » « less
  5. null (Ed.)
    The implementation of cryptographic primitives in integrated circuits (ICs) continues to increase over the years due to the recent advancement of semiconductor manufacturing and reduction of cost per transistors. The hardware implementation makes cryptographic operations faster and more energy-efficient. However, various hardware attacks have been proposed aiming to extract the secret key in order to undermine the security of these primitives. In this paper, we focus on the widely used advanced encryption standard (AES) block cipher and demonstrate its vulnerability against tampering attack. Our proposed attack relies on implanting a hardware Trojan in the netlist by an untrusted foundry, which can design and implement such a Trojan as it has access to the design layout and mask information. The hardware Trojan's activation modifies a particular round's input data by preventing the effect of all previous rounds' key-dependent computation. We propose to use a sequential hardware Trojan to deliver the payload at the input of an internal round for achieving this modification of data. All the internal subkeys, and finally, the secret key can be computed from the observed ciphertext once the Trojan is activated. We implement our proposed tampering attack with a sequential hardware Trojan inserted into a 128-bit AES design from OpenCores benchmark suite and report the area overhead to demonstrate the feasibility of the proposed tampering attack. 
    more » « less