As the cryptographic community turns its focus toward post-quantum cryptography, the demand for classical cryptographic schemes such as Elliptic Curve Cryptography (ECC) remains high. In this work, we present an optimized implementation of the Edwards Curve Digital Signature Algorithm (EdDSA) operations Keygen, Sign, and Verify using the Ed25519 parameter on the ARM Cortex-M4 using optimized assembly code. We discuss the optimization of field and group arithmetic to produce high-throughput cryptographic primitives. Then, we present the first SCA-resistant implementation of the Signed Comb method, and Test Vector Leakage Assessment (TVLA) measurements. Our fastest implementation performs Ed25519 Keygen in 200,000 cycles, Sign in 240,000 cycles, and Verify in 720,000 cycles on the ARM Cortex-M4.
more »
« less
Efficient and Side-Channel Resistant Ed25519 on ARM Cortex-M4
An estimated 14.7 billion Internet of Things (IoT) devices will be connected to the Internet by 2023. The ubiquity of these devices creates exciting new opportunities, while at the same time introducing new concerns about privacy and security. To address these concerns, efficient cryptographic algorithms are needed to secure communication between IoT devices. In this work, we present an optimized implementation of one such algorithm, the Edwards Curve Digital Signature Algorithm (EdDSA) with operations Keygen, Sign, and Verify using the Ed25519 parameter on the ARM Cortex-M4 implemented in assembly code. The ARM Cortex-M4 is used in millions of devices world-wide, and is a popular choice for a wide range of IoT applications. We discuss the optimization of field and group arithmetic on this platform to produce high-throughput cryptographic primitives. Then, we present the first SCA-resistant implementation of the Signed Comb method, and Test Vector Leakage Assessment (TVLA) measurements. Our fastest implementation performs Ed25519 Keygen in 200,000 cycles, Sign in 240,000 cycles, and Verify in 720,000 cycles on the ARM Cortex-M4.
more »
« less
- Award ID(s):
- 2147196
- PAR ID:
- 10586515
- Publisher / Repository:
- IEEE
- Date Published:
- Journal Name:
- IEEE transactions on circuits and systems I Regular papers
- ISSN:
- 1549-8328
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The elliptic curve family of schemes has the lowest computational latency, memory use, energy consumption, and bandwidth requirements, making it the most preferred public key method for adoption into network protocols. Being suitable for embedded devices and applicable for key exchange and authentication, ECC is assuming a prominent position in the field of IoT cryptography. The attractive properties of the relatively new curve Curve448 contribute to its inclusion in the TLS1.3 protocol and pique the interest of academics and engineers aiming at studying and optimizing the schemes. When addressing low-end IoT devices, however, the literature indicates little work on these curves. In this paper, we present an efficient design for both protocols based on Montgomery curve Curve448 and its birationally equivalent Edwards curve Ed448 used for key agreement and digital signature algorithm, specifically the X448 function and the Ed448 DSA, relying on efficient lowlevel arithmetic operations targeting the ARM-based Cortex-M4 platform. Our design performs point multiplication, the base of the Elliptic Curve Diffie-Hellman (ECDH), in 3,2KCCs, resulting in more than 48% improvement compared to the best previous work based on Curve448, and performs sign and verify, the main operations of the Edwards-curves Digital Signature Algorithm (EdDSA), in 6,038KCCs and 7,404KCCs, showing a speedup of around 11% compared to the counterparts. We present novel modular multiplication and squaring architectures reaching 25% and s 35% faster runtime than the previous best-reported results, respectively, based on Curve448 key exchange counterparts, and s 13% and s 25% better latency results than the Ed448-based digital signature counterparts targeting Cortex-M4 platform.more » « less
-
The elliptic curve family of schemes has the lowest computational latency, memory use, energy consumption, and bandwidth requirements, making it the most preferred public key method for adoption into network protocols. Being suitable for embedded devices and applicable for key exchange and authentication, ECC is assuming a prominent position in the field of IoT cryptography. The attractive properties of the relatively new curve Curve448 contribute to its inclusion in the TLS1.3 protocol and pique the interest of academics and engineers aiming at studying and optimizing the schemes. When addressing low-end IoT devices, however, the literature indicates little work on these curves. In this paper, we present an efficient design for both protocols based on Montgomery curve Curve448 and its birationally equivalent Edwards curve Ed448 used for key agreement and digital signature algorithm, specifically the X448 function and the Ed448 DSA, relying on efficient low-level arithmetic operations targeting the ARM-based Cortex-M4 platform. Our design performs point multiplication, the base of the Elliptic Curve Diffie-Hellman (ECDH), in 3,2KCCs, resulting in more than 48% improvement compared to the best previous work based on Curve448, and performs sign and verify, the main operations of the Edwards-curves Digital Signature Algorithm (EdDSA), in 6,038KCCs and 7,404KCCs, showing a speedup of around 11% compared to the counterparts. We present novel modular multiplication and squaring architectures reaching ∼25% and ∼35% faster runtime than the previous best-reported results, respectively, based on Curve448 key exchange counterparts, and ∼13% and ∼25% better latency results than the Ed448-based digital signature counterparts targeting Cortex-M4 platform.more » « less
-
The highly secure Curve448 cryptographic algorithm has been recently recommended by NIST. While this algorithm provides 224-bit security over elliptic curve cryptography, its implementation may still be vulnerable to physical sidechannel attacks. In this paper, we present a speed-optimized implementation on a 32-bit ARM Cortex-M4 platform achieving more than 40% improvement compared to the best previous work. Our design can perform 43 scalar multiplications per second on an STM32F4 working at 168 MHz. At 24 MHz, our proposed implementation takes only 3,740k clock cycles. On the other hand, the security of Curve448 is thoroughly evaluated to have a trade-off between performance and required protection. We apply different effective countermeasures to prevent a subset of side-channel and fault injection attacks at the cost of 8%-22% overhead.more » « less
-
The demand for classical cryptography schemes continues to increase due to the exhaustive studies on their security. Thus, constant improvement of timing, power consumption, and memory requirements are needed for the most widely used classical Elliptic Curve Cryptography (ECC) primitives, suiting high- as well as low-end devices. In this work, we present the first implementation of the Edwards Curve Digital Signature Algorithm (EdDSA) based on the Ed448 targeting the ARM Cortex-M4-based STM32F407VG microcontroller, which forms a large part of the Internet of Things (IoT) world. We report timing and memory consumption results based on portable C and targetspecific hand-crafted assembly code implementations of the lowlevel finite filed arithmetics. We optimize the high-level group operations by implementing the efficient scalar multiplication over the Ed448 isogenous map to reduce the computation complexity. Furthermore, we provide a side-channel analysis (SCA) and fault attack protected design by developing point randomization, scalar blinding techniques, and repeated signature, and evaluate the performance. Our optimized architecture performs a signature and verification in 39.88ms and 51.54ms, respectively, where SCA protection can be achieved at less than 6.4% cost of performance overhead.more » « less
An official website of the United States government

