skip to main content


Title: Time-Space Lower Bounds for Finding Collisions in Merkle-Damgard Hash Functions
We revisit the problem of finding B-block-long collisions in Merkle-Damg˚ard Hash Functions in the auxiliary-input random oracle model, in which an attacker gets a piece of S-bit advice about the random oracle and makes T oracle queries. Akshima, Cash, Drucker and Wee (CRYPTO 2020), based on the work of Coretti, Dodis, Guo and Steinberger (EUROCRYPT 2018), showed a simple attack for 2 ≤ B ≤ T (with respect to a random salt). The attack achieves advantage Ω( e ST B/2 n + T 2/2 n) where n is the output length of the random oracle. They conjectured that this attack is optimal. However, this so-called STB conjecture was only proved for B ≈ T and B = 2. Very recently, Ghoshal and Komargodski (CRYPTO 22) confirmed STB conjecture for all constant values of B, and provided an Oe(S 4T B2/2 n + T 2/2 n) bound for all choices of B. In this work, we prove an Oe((ST B/2 n)· max{1, ST2/2 n}+T 2/2 n) bound for every 2 < B < T. Our bound confirms the STB conjecture for ST2 ≤ 2 n, and is optimal up to a factor of S for ST2 > 2 n (note as T 2 is always at most 2n, otherwise finding a collision is trivial by the birthday attack). Our result subsumes all previous upper bounds for all ranges of parameters except for B = Oe(1) and ST2 > 2 n. We obtain our results by adopting and refining the technique of Chung, Guo, Liu, and Qian (FOCS 2020). Our approach yields more modular proofs and sheds light on how to bypass the limitations of prior techniques. Along the way, we obtain a considerably simpler and illuminating proof for B = 2, recovering the main result of Akshima, Cash, Drucker and Wee.  more » « less
Award ID(s):
1925288
NSF-PAR ID:
10353638
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
IACR CRYPTO 2022
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We study collision-finding against Merkle-Damgård hashing in the random-oracle model by adversaries with an arbitrary S-bit auxiliary advice input about the random oracle and T queries. Recent work showed that such adversaries can find collisions (with respect to a random IV) with advantage 𝛺(𝑆𝑇2/2𝑛) , where n is the output length, beating the birthday bound by a factor of S. These attacks were shown to be optimal. We observe that the collisions produced are very long, on the order of T blocks, which would limit their practical relevance. We prove several results related to improving these attacks to find shorter collisions. We first exhibit a simple attack for finding B-block-long collisions achieving advantage 𝛺̃ (𝑆𝑇𝐵/2𝑛) . We then study if this attack is optimal. We show that the prior technique based on the bit-fixing model (used for the 𝑆𝑇2/2𝑛 bound) provably cannot reach this bound, and towards a general result we prove there are qualitative jumps in the optimal attacks for finding length 1, length 2, and unbounded-length collisions. Namely, the optimal attacks achieve (up to logarithmic factors) on the order of (𝑆+𝑇)/2𝑛 , 𝑆𝑇/2𝑛 and 𝑆𝑇2/2𝑛 advantage. We also give an upper bound on the advantage of a restricted class of short-collision finding attacks via a new analysis on the growth of trees in random functional graphs that may be of independent interest. 
    more » « less
  2. Lysyanskaya, Anna ; Handschuh, Helena (Ed.)
    We study the black-box function inversion problem, which is the problem of finding x[N] such that f(x)=y, given as input some challenge point y in the image of a function f:[N][N], using T oracle queries to f and preprocessed advice 01S depending on f. We prove a number of new results about this problem, as follows. 1. We show an algorithm that works for any T and S satisfying TS2maxST=(N3) . In the important setting when ST, this improves on the celebrated algorithm of Fiat and Naor [STOC, 1991], which requires TS3N3. E.g., Fiat and Naor's algorithm is only non-trivial for SN23 , while our algorithm gives a non-trivial tradeoff for any SN12 . (Our algorithm and analysis are quite simple. As a consequence of this, we also give a self-contained and simple proof of Fiat and Naor's original result, with certain optimizations left out for simplicity.) 2. We show a non-adaptive algorithm (i.e., an algorithm whose ith query xi is chosen based entirely on and y, and not on the f(x1)f(xi−1)) that works for any T and S satisfying S=(Nlog(NT)) giving the first non-trivial non-adaptive algorithm for this problem. E.g., setting T=Npolylog(N) gives S=(NloglogN). This answers a question due to Corrigan-Gibbs and Kogan [TCC, 2019], who asked whether it was possible for a non-adaptive algorithm to work with parameters T and S satisfying T+SlogNo(N) . We also observe that our non-adaptive algorithm is what we call a guess-and-check algorithm, that is, it is non-adaptive and its final output is always one of the oracle queries x1xT. For guess-and-check algorithms, we prove a matching lower bound, therefore completely characterizing the achievable parameters (ST) for this natural class of algorithms. (Corrigan-Gibbs and Kogan showed that any such lower bound for arbitrary non-adaptive algorithms would imply new circuit lower bounds.) 3. We show equivalence between function inversion and a natural decision version of the problem in both the worst case and the average case, and similarly for functions f:[N][M] with different ranges. All of the above results are most naturally described in a model with shared randomness (i.e., random coins shared between the preprocessing algorithm and the online algorithm). However, as an additional contribution, we show (using a technique from communication complexity due to Newman [IPL, 1991]) how to generically convert any algorithm that uses shared randomness into one that does not. 
    more » « less
  3. Memory-hard functions (MHFs) are a key cryptographic primitive underlying the design of moderately expensive password hashing algorithms and egalitarian proofs of work. Over the past few years several increasingly stringent goals for an MHF have been proposed including the requirement that the MHF have high sequential space-time (ST) complexity, parallel space-time complexity, amortized area-time (aAT) complexity and sustained space complexity. Data-Independent Memory Hard Functions (iMHFs) are of special interest in the context of password hashing as they naturally resist side-channel attacks. iMHFs can be specified using a directed acyclic graph (DAG) $G$ with $N=2^n$ nodes and low indegree and the complexity of the iMHF can be analyzed using a pebbling game. Recently, Alwen et al. [CCS'17] constructed an DAG called DRSample which has aAT complexity at least $\Omega\left( N^2/\log N\right)$. Asymptotically DRSample outperformed all prior iMHF constructions including Argon2i, winner of the password hashing competition (aAT cost $\mathcal{O}\left(N^{1.767}\right)$), though the constants in these bounds are poorly understood. We show that the the greedy pebbling strategy of Boneh et al. [ASIACRYPT'16] is particularly effective against DRSample e.g., the aAT cost is $\mathcal{O}\left( N^2/\log N\right)$. In fact, our empirical analysis {\em reverses} the prior conclusion of Alwen et al. that DRSample provides stronger resistance to known pebbling attacks for practical values of $N \leq 2^{24}$. We construct a new iMHF candidate (DRSample+BRG) by using the bit-reversal graph to extend DRSample. We then prove that the construction is asymptotically optimal under every MHF criteria, and we empirically demonstrate that our iMHF provides the best resistance to {\em known} pebbling attacks. For example, we show that any parallel pebbling attack either has aAT cost $\omega(N^2)$ or requires at least $\Omega(N)$ steps with $\Omega(N/\log N)$ pebbles on the DAG. This makes our construction the first practical iMHF with a strong sustained space-complexity guarantee and immediately implies that any parallel pebbling has aAT complexity $\Omega(N^2/\log N)$. We also prove that any sequential pebbling (including the greedy pebbling attack) has aAT cost $\Omega\left( N^2\right)$ and, if a plausible conjecture holds, any parallel pebbling has aAT cost $\Omega(N^2 \log \log N/\log N)$ --- the best possible bound for an iMHF. We implement our new iMHF and demonstrate that it is just as fast as Argon2. Along the way we propose a simple modification to the Argon2 round function which increases an attacker's aAT cost by nearly an order of magnitude without increasing running time on a CPU. Finally, we give a pebbling reduction which proves that in the parallel random oracle model (PROM) the cost of evaluating an iMHF like Argon2i or DRSample+BRG is given by the pebbling cost of the underlying DAG. Prior pebbling reductions assumed that the iMHF round function concatenates input labels before hashing and did not apply to practical iMHFs such as Argon2i, DRSample or DRSample+BRG where input labels are instead XORed together. 
    more » « less
  4. We give two new quantum algorithms for solving semidefinite programs (SDPs) providing quantum speed-ups. We consider SDP instances with m constraint matrices, each of dimension n, rank at most r, and sparsity s. The first algorithm assumes an input model where one is given access to an oracle to the entries of the matrices at unit cost. We show that it has run time O~(s^2 (sqrt{m} epsilon^{-10} + sqrt{n} epsilon^{-12})), with epsilon the error of the solution. This gives an optimal dependence in terms of m, n and quadratic improvement over previous quantum algorithms (when m ~~ n). The second algorithm assumes a fully quantum input model in which the input matrices are given as quantum states. We show that its run time is O~(sqrt{m}+poly(r))*poly(log m,log n,B,epsilon^{-1}), with B an upper bound on the trace-norm of all input matrices. In particular the complexity depends only polylogarithmically in n and polynomially in r. We apply the second SDP solver to learn a good description of a quantum state with respect to a set of measurements: Given m measurements and a supply of copies of an unknown state rho with rank at most r, we show we can find in time sqrt{m}*poly(log m,log n,r,epsilon^{-1}) a description of the state as a quantum circuit preparing a density matrix which has the same expectation values as rho on the m measurements, up to error epsilon. The density matrix obtained is an approximation to the maximum entropy state consistent with the measurement data considered in Jaynes' principle from statistical mechanics. As in previous work, we obtain our algorithm by "quantizing" classical SDP solvers based on the matrix multiplicative weight update method. One of our main technical contributions is a quantum Gibbs state sampler for low-rank Hamiltonians, given quantum states encoding these Hamiltonians, with a poly-logarithmic dependence on its dimension, which is based on ideas developed in quantum principal component analysis. We also develop a "fast" quantum OR lemma with a quadratic improvement in gate complexity over the construction of Harrow et al. [Harrow et al., 2017]. We believe both techniques might be of independent interest. 
    more » « less
  5. The codewords of the homomorphism code aHom(G,H) are the affine homomorphisms between two finite groups, G and H, generalizing Hadamard codes. Following the work of Goldreich-Levin (1989), Grigorescu et al. (2006), Dinur et al. (2008), and Guo and Sudan (2014), we further expand the range of groups for which local list-decoding is possible up to mindist, the minimum distance of the code. In particular, for the first time, we do not require either G or H to be solvable. Specifically, we demonstrate a poly(1/epsilon) bound on the list size, i. e., on the number of codewords within distance (mindist-epsilon) from any received word, when G is either abelian or an alternating group, and H is an arbitrary (finite or infinite) group. We conjecture that a similar bound holds for all finite simple groups as domains; the alternating groups serve as the first test case. The abelian vs. arbitrary result permits us to adapt previous techniques to obtain efficient local list-decoding for this case. We also obtain efficient local list-decoding for the permutation representations of alternating groups (the codomain is a symmetric group) under the restriction that the domain G=A_n is paired with codomain H=S_m satisfying m < 2^{n-1}/sqrt{n}. The limitations on the codomain in the latter case arise from severe technical difficulties stemming from the need to solve the homomorphism extension (HomExt) problem in certain cases; these are addressed in a separate paper (Wuu 2018). We introduce an intermediate "semi-algorithmic" model we call Certificate List-Decoding that bypasses the HomExt bottleneck and works in the alternating vs. arbitrary setting. A certificate list-decoder produces partial homomorphisms that uniquely extend to the homomorphisms in the list. A homomorphism extender applied to a list of certificates yields the desired list. 
    more » « less