skip to main content

Title: Data-Independent Memory Hard Functions: New Attacks and Stronger Constructions
Memory-hard functions (MHFs) are a key cryptographic primitive underlying the design of moderately expensive password hashing algorithms and egalitarian proofs of work. Over the past few years several increasingly stringent goals for an MHF have been proposed including the requirement that the MHF have high sequential space-time (ST) complexity, parallel space-time complexity, amortized area-time (aAT) complexity and sustained space complexity. Data-Independent Memory Hard Functions (iMHFs) are of special interest in the context of password hashing as they naturally resist side-channel attacks. iMHFs can be specified using a directed acyclic graph (DAG) $G$ with $N=2^n$ nodes and low indegree and the complexity of the iMHF can be analyzed using a pebbling game. Recently, Alwen et al. [CCS'17] constructed an DAG called DRSample which has aAT complexity at least $\Omega\left( N^2/\log N\right)$. Asymptotically DRSample outperformed all prior iMHF constructions including Argon2i, winner of the password hashing competition (aAT cost $\mathcal{O}\left(N^{1.767}\right)$), though the constants in these bounds are poorly understood. We show that the the greedy pebbling strategy of Boneh et al. [ASIACRYPT'16] is particularly effective against DRSample e.g., the aAT cost is $\mathcal{O}\left( N^2/\log N\right)$. In fact, our empirical analysis {\em reverses} the prior conclusion of Alwen et al. that DRSample provides more » stronger resistance to known pebbling attacks for practical values of $N \leq 2^{24}$. We construct a new iMHF candidate (DRSample+BRG) by using the bit-reversal graph to extend DRSample. We then prove that the construction is asymptotically optimal under every MHF criteria, and we empirically demonstrate that our iMHF provides the best resistance to {\em known} pebbling attacks. For example, we show that any parallel pebbling attack either has aAT cost $\omega(N^2)$ or requires at least $\Omega(N)$ steps with $\Omega(N/\log N)$ pebbles on the DAG. This makes our construction the first practical iMHF with a strong sustained space-complexity guarantee and immediately implies that any parallel pebbling has aAT complexity $\Omega(N^2/\log N)$. We also prove that any sequential pebbling (including the greedy pebbling attack) has aAT cost $\Omega\left( N^2\right)$ and, if a plausible conjecture holds, any parallel pebbling has aAT cost $\Omega(N^2 \log \log N/\log N)$ --- the best possible bound for an iMHF. We implement our new iMHF and demonstrate that it is just as fast as Argon2. Along the way we propose a simple modification to the Argon2 round function which increases an attacker's aAT cost by nearly an order of magnitude without increasing running time on a CPU. Finally, we give a pebbling reduction which proves that in the parallel random oracle model (PROM) the cost of evaluating an iMHF like Argon2i or DRSample+BRG is given by the pebbling cost of the underlying DAG. Prior pebbling reductions assumed that the iMHF round function concatenates input labels before hashing and did not apply to practical iMHFs such as Argon2i, DRSample or DRSample+BRG where input labels are instead XORed together. « less
Authors:
; ; ; ; ;
Award ID(s):
1755708
Publication Date:
NSF-PAR ID:
10092895
Journal Name:
Crypto
ISSN:
1790-2703
Sponsoring Org:
National Science Foundation
More Like this
  1. Memory-hard functions (MHF) are functions whose evaluation cost is dominated by memory cost. MHFs are egalitarian, in the sense that evaluating them on dedicated hardware (like FPGAs or ASICs) is not much cheaper than on off-the-shelf hardware (like x86 CPUs). MHFs have interesting cryptographic applications, most notably to password hashing and securing blockchains. Alwen and Serbinenko [STOC’15] define the cumulative memory complexity (cmc) of a function as the sum (over all time-steps) of the amount of memory required to compute the function. They advocate that a good MHF must have high cmc. Unlike previous notions, cmc takes into account thatmore »dedicated hardware might exploit amortization and parallelism. Still, cmc has been critizised as insufficient, as it fails to capture possible time-memory trade-offs; as memory cost doesn’t scale linearly, functions with the same cmc could still have very different actual hardware cost. In this work we address this problem, and introduce the notion of sustained-memory complexity, which requires that any algorithm evaluating the function must use a large amount of memory for many steps. We construct functions (in the parallel random oracle model) whose sustained-memory complexity is almost optimal: our function can be evaluated using n steps and O(n/log(n)) memory, in each step making one query to the (fixed-input length) random oracle, while any algorithm that can make arbitrary many parallel queries to the random oracle, still needs Ω(n/log(n)) memory for Ω(n) steps. As has been done for various notions (including cmc) before, we reduce the task of constructing an MHFs with high sustained-memory complexity to proving pebbling lower bounds on DAGs. Our main technical contribution is the construction is a family of DAGs on n nodes with constant indegree with high “sustained-space complexity”, meaning that any parallel black-pebbling strategy requires Ω(n/log(n)) pebbles for at least Ω(n) steps. Along the way we construct a family of maximally “depth-robust” DAGs with maximum indegree O(logn) , improving upon the construction of Mahmoody et al. [ITCS’13] which had maximum indegree O(log2n⋅polylog(logn)) .« less
  2. Memory-hard functions (MHF) are functions whose evaluation cost is dominated by memory cost. MHFs are egalitarian, in the sense that evaluating them on dedicated hardware (like FPGAs or ASICs) is not much cheaper than on off-the-shelf hardware (like x86 CPUs). MHFs have interesting cryptographic applications, most notably to password hashing and securing blockchains. Alwen and Serbinenko [STOC'15] define the cumulative memory complexity (cmc) of a function as the sum (over all time-steps) of the amount of memory required to compute the function. They advocate that a good MHF must have high cmc. Unlike previous notions, cmc takes into account thatmore »dedicated hardware might exploit amortization and parallelism. Still, cmc has been critizised as insufficient, as it fails to capture possible time-memory trade-offs, as memory cost doesn't scale linearly, functions with the same cmc could still have very different actual hardware cost. In this work we address this problem, and introduce the notion of sustained-memory complexity, which requires that any algorithm evaluating the function must use a large amount of memory for many steps. We construct functions (in the parallel random oracle model) whose sustained-memory complexity is almost optimal: our function can be evaluated using n steps and O(n/log(n)) memory, in each step making one query to the (fixed-input length) random oracle, while any algorithm that can make arbitrary many parallel queries to the random oracle, still needs Ω(n/log(n)) memory for Ω(n) steps. Our main technical contribution is the construction is a family of DAGs on n nodes with constant indegree with high "sustained-space complexity", meaning that any parallel black-pebbling strategy requires Ω(n/log(n)) pebbles for at least Ω(n) steps.« less
  3. Memory-hard functions (MHF) are functions whose evaluation cost is dominated by memory cost. MHFs are egalitarian, in the sense that evaluating them on dedicated hardware (like FPGAs or ASICs) is not much cheaper than on off-the-shelf hardware (like x86 CPUs). MHFs have interesting cryptographic applications, most notably to password hashing and securing blockchains. Alwen and Serbinenko [STOC'15] define the cumulative memory complexity (cmc) of a function as the sum (over all time-steps) of the amount of memory required to compute the function. They advocate that a good MHF must have high cmc. Unlike previous notions, cmc takes into account thatmore »dedicated hardware might exploit amortization and parallelism. Still, cmc has been critizised as insufficient, as it fails to capture possible time-memory trade-offs, as memory cost doesn't scale linearly, functions with the same cmc could still have very different actual hardware cost. In this work we address this problem, and introduce the notion of sustained-memory complexity, which requires that any algorithm evaluating the function must use a large amount of memory for many steps. We construct functions (in the parallel random oracle model) whose sustained-memory complexity is almost optimal: our function can be evaluated using n steps and O(n/log(n)) memory, in each step making one query to the (fixed-input length) random oracle, while any algorithm that can make arbitrary many parallel queries to the random oracle, still needs Ω(n/log(n)) memory for Ω(n) steps. Our main technical contribution is the construction is a family of DAGs on n nodes with constant indegree with high "sustained-space complexity", meaning that any parallel black-pebbling strategy requires Ω(n/log(n)) pebbles for at least Ω(n) steps.« less
  4. Memory Hard Functions (MHFs) have been proposed as an answer to the growing inequality between the computational speed of general purpose CPUs and Application Specific Integrated Circuits (ASICs). MHFs have seen widespread applications including password hashing, key stretching and proofs of work. Several metrics have been proposed to quantify the “memory hardness” of a function. Cumulative memory complexity (CMC) [8] (or amortized Area × Time complexity [4]) attempts to quantify the cost to acquire/build the hardware to evaluate the function — after normalizing the time it takes to evaluate the function. By contrast, bandwidth hardness [30] attempts to quantify themore »amortized energy costs of evaluating this function on hardware — which in turn is largely dominated by the number of cache misses. Ideally, a good MHF would be both bandwidth hard and have high cumulative memory complexity. While the cumulative memory complexity of leading MHF candidates is well understood, little is known about the bandwidth hardness of many prominent MHF candidates. Our contributions are as follows: First, we provide the first reduction proving that, in the parallel random oracle model, the bandwidth hardness of a Data-Independent Memory Hard Function (iMHF) is described by the red-blue pebbling cost of the directed acyclic graph (DAG) associated with that iMHF. Second, we show that the goals of designing an MHF with high CMC/bandwidth hardness are well aligned. In particular, we prove that any function with high CMC also has relatively high energy costs. This result leads to the first unconditional lower bound on the energy cost of scrypt in the parallel random oracle model. Third, we analyze the bandwidth hardness of several prominent iMHF candidates such as Argon2i [11], winner of the password hashing competition, aATSample and DRSample [4] — the first practical iMHF with essentially asymptotically optimal CMC. We show Argon2i, aATSample and DRSample are maximally bandwidth hard under appropriate cache size. Finally, we show that the problem of finding a red-blue pebbling with minimum energy cost is NP-hard.« less
  5. The cumulative pebbling complexity of a directed acyclic graph G is defined as cc(G) = min_P ∑_i |P_i|, where the minimum is taken over all legal (parallel) black pebblings of G and |P_i| denotes the number of pebbles on the graph during round i. Intuitively, cc(G) captures the amortized Space-Time complexity of pebbling m copies of G in parallel. The cumulative pebbling complexity of a graph G is of particular interest in the field of cryptography as cc(G) is tightly related to the amortized Area-Time complexity of the Data-Independent Memory-Hard Function (iMHF) f_{G,H} [Joël Alwen and Vladimir Serbinenko, 2015] definedmore »using a constant indegree directed acyclic graph (DAG) G and a random oracle H(⋅). A secure iMHF should have amortized Space-Time complexity as high as possible, e.g., to deter brute-force password attacker who wants to find x such that f_{G,H}(x) = h. Thus, to analyze the (in)security of a candidate iMHF f_{G,H}, it is crucial to estimate the value cc(G) but currently, upper and lower bounds for leading iMHF candidates differ by several orders of magnitude. Blocki and Zhou recently showed that it is NP-Hard to compute cc(G), but their techniques do not even rule out an efficient (1+ε)-approximation algorithm for any constant ε>0. We show that for any constant c > 0, it is Unique Games hard to approximate cc(G) to within a factor of c. Along the way, we show the hardness of approximation of the DAG Vertex Deletion problem on DAGs of constant indegree. Namely, we show that for any k,ε >0 and given a DAG G with N nodes and constant indegree, it is Unique Games hard to distinguish between the case that G is (e_1, d_1)-reducible with e_1=N^{1/(1+2 ε)}/k and d_1=k N^{2 ε/(1+2 ε)}, and the case that G is (e_2, d_2)-depth-robust with e_2 = (1-ε)k e_1 and d_2= 0.9 N^{(1+ε)/(1+2 ε)}, which may be of independent interest. Our result generalizes a result of Svensson who proved an analogous result for DAGs with indegree 𝒪(N).« less