

Search for: All records

Award ID contains: 2047272


  1. We revisit computationally relaxed locally decodable codes (crLDCs) (Blocki et al., Trans. Inf. Theory ’21) and give two new constructions. Our first construction is a Hamming crLDC that is conceptually simpler than prior constructions, leveraging digital signature schemes and an appropriately chosen Hamming code. Our second construction is an extension of our Hamming crLDC to handle insertion-deletion (InsDel) errors, yielding an InsDel crLDC. This extension crucially relies on the noisy binary search techniques of Block et al. (FSTTCS ’20) to handle InsDel errors. Both crLDC constructions have binary codeword alphabets, are resilient to a constant fraction of Hamming and InsDel errors, respectively, and under suitable parameter choices have poly-logarithmic locality and encoding length linear in the message length and polynomial in the security parameter. These parameters compare favorably to prior constructions in the poly-logarithmic locality regime. 
    Free, publicly-accessible full text available June 25, 2024
  2. The data management of large companies often prioritizes more recent data, as a source of more accurate predictions than outdated data. For example, the Facebook data policy retains user search histories for months while the Google data retention policy states that browser information may be stored for up to months. These policies are captured by the sliding window model, in which only the most recent statistics form the underlying dataset. In this paper, we consider the problem of privately releasing the L2-heavy hitters in the sliding window model. These include the Lp-heavy hitters for p ≤ 2 and are, in some sense, the strongest possible guarantees that can be achieved using polylogarithmic space, but they cannot be handled by existing techniques due to the sub-additivity of the L2 norm. Moreover, existing non-private sliding window algorithms use the smooth histogram framework, which has high sensitivity. To overcome these barriers, we introduce the first differentially private algorithm for L2-heavy hitters in the sliding window model by initiating a number of L2-heavy hitter algorithms across the stream with a significantly lower threshold. Similarly, we augment the algorithms with an approximate frequency tracking algorithm with significantly higher accuracy. We then use smooth sensitivity and statistical distance arguments to show that we can add noise proportional to an estimation of the norm. To the best of our knowledge, our techniques are the first to privately release statistics that are related to a sub-additive function in the sliding window model, and may be of independent interest to future differentially private algorithmic design in the sliding window model. 
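
To make the object being released concrete, here is a minimal sketch of an exact, non-private baseline for L2-heavy hitters over a sliding window. It is not the algorithm of the abstract: it stores the entire window (linear rather than polylogarithmic space) and adds no noise. The class name, `window`, and `eps` are illustrative assumptions; an item counts as heavy when its frequency in the window is at least `eps` times the L2 norm of the window's frequency vector.

```python
from collections import Counter, deque
from math import sqrt


class SlidingWindowL2HeavyHitters:
    """Exact, NON-private baseline that stores the whole window.

    Hypothetical illustration only: the abstract's algorithm uses
    polylogarithmic space and differential privacy; this one does neither.
    """

    def __init__(self, window, eps):
        self.window = window      # number of most recent updates kept
        self.eps = eps            # heavy-hitter threshold parameter
        self.items = deque()      # the active window, oldest item first
        self.freq = Counter()     # frequency of each item in the window

    def update(self, item):
        """Append one stream update and expire the oldest if needed."""
        self.items.append(item)
        self.freq[item] += 1
        if len(self.items) > self.window:
            old = self.items.popleft()
            self.freq[old] -= 1
            if self.freq[old] == 0:
                del self.freq[old]

    def heavy_hitters(self):
        """Items whose window frequency is >= eps * ||f||_2."""
        l2 = sqrt(sum(c * c for c in self.freq.values()))
        return {x for x, c in self.freq.items() if c >= self.eps * l2}


hh = SlidingWindowL2HeavyHitters(window=6, eps=0.5)
for symbol in "aaaaabcdef":
    hh.update(symbol)
# "a" was frequent early in the stream but has mostly expired by now.
print(hh.heavy_hitters())
```

Items that dominated an expired prefix of the stream stop being reported once they leave the window, which is the behavior the sliding window model is meant to capture.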
  3. Ta-Shma, Amnon (Ed.)
    Locally Decodable Codes (LDCs) are error-correcting codes C:Σⁿ → Σ^m, encoding messages in Σⁿ to codewords in Σ^m, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length m that is super-polynomial in n, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) show how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly-improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Standard LDCs for Insdel errors were first studied by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), and are further motivated by recent advances in DNA random access bio-technologies. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries (even adaptively), over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a "phase-transition"-type behavior on the codeword length for some constant-query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with almost linear codeword length and constant query complexity, matching the parameters of the Hamming variants. 
On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the Insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs. 
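
For orientation, the textbook Hadamard code is a standard (non-relaxed) 2-query Hamming LDC whose codeword length is exponential in the message length. The sketch below is that classical code, not a construction from the abstract; the function names are illustrative assumptions. Messages and positions are represented as Python integers used as bit vectors.

```python
import random


def inner_product_mod2(x, a):
    """Inner product over GF(2) of two bit vectors stored as integers."""
    return bin(x & a).count("1") % 2


def hadamard_encode(x, n):
    """Encode an n-bit message x as the 2**n inner products <x, a>."""
    return [inner_product_mod2(x, a) for a in range(1 << n)]


def decode_bit_with(word, i, a):
    """Recover message bit i from two queries: positions a and a XOR e_i.

    On an uncorrupted codeword, <x, a> XOR <x, a XOR e_i> equals x_i.
    """
    return word[a] ^ word[a ^ (1 << i)]


def decode_bit(word, n, i, rng=random):
    """Local decoder: pick a uniformly at random and make two queries."""
    a = rng.randrange(1 << n)
    return decode_bit_with(word, i, a)


n, x = 4, 0b1011
word = hadamard_encode(x, n)
word[5] ^= 1  # corrupt a single codeword position
# Majority vote over repeated local decodings of bit 0.
votes = [decode_bit(word, n, 0) for _ in range(101)]
print(int(sum(votes) > len(votes) // 2), x & 1)
```

Each of the two queries is uniformly distributed on its own, so a small fraction of corrupted positions spoils only a small fraction of the decoder's random choices.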
  4. A central challenge in password security is to characterize the attacker's guessing curve, i.e., the probability that the attacker will crack a random user's password within the first G guesses. A key challenge is that the guessing curve depends on the attacker's guessing strategy and the distribution of user passwords, both of which are unknown to us. In this work we aim to follow Kerckhoffs's principle and analyze the performance of an optimal attacker who knows the password distribution. Let λ_G denote the probability that such an attacker can crack a random user's password within G guesses. We develop several statistically rigorous techniques to upper and lower bound λ_G given N independent samples from the unknown password distribution P. We show that our upper/lower bounds on λ_G hold with high confidence and we apply our techniques to analyze eight large password datasets. Our empirical analysis shows that even state-of-the-art password cracking models are often significantly less guess efficient than an attacker who can optimize its attack based on its (partial) knowledge of the password distribution. We also apply our statistical tools to re-examine different models of the password distribution, i.e., the empirical password distribution and Zipf's Law. We find that the empirical distribution closely matches our upper/lower bounds on λ_G when the guessing number G is not too large, i.e., G ≪ N. However, for larger values of G our empirical analysis rigorously demonstrates that the empirical distribution (resp. Zipf's Law) overestimates the attacker's success rate. We apply our statistical techniques to upper/lower bound the effectiveness of password throttling mechanisms (key-stretching) which are used to reduce the number of attacker guesses G. 
Finally, if we are willing to make an additional assumption about the way users respond to password restrictions, we can use our statistical techniques to evaluate the effectiveness of various password composition policies which restrict the passwords that users may select. 
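
As a point of reference for the quantities above, the following sketch computes the naive empirical guessing curve from a sample: the fraction of sampled passwords covered by the G most frequent ones. This is only the empirical-distribution estimate that the abstract compares against, not the paper's upper or lower bounds; the function name and the toy sample are assumptions for illustration.

```python
from collections import Counter


def empirical_guessing_curve(samples, max_guesses):
    """Naive estimate of the guessing curve for G = 1..max_guesses.

    Entry G-1 is the fraction of samples matched by the G most frequent
    sampled passwords. The abstract reports that this kind of estimate
    overestimates the attacker's success rate when G is large.
    """
    counts = sorted(Counter(samples).values(), reverse=True)
    total = len(samples)
    curve, cracked = [], 0
    for g in range(max_guesses):
        if g < len(counts):
            cracked += counts[g]
        curve.append(cracked / total)
    return curve


# Toy sample of N = 10 passwords (illustrative only).
sample = ["123456"] * 5 + ["password"] * 3 + ["qwerty", "letmein"]
print(empirical_guessing_curve(sample, 3))
```

Once G reaches the number of distinct sampled passwords the naive curve hits 1.0, which no real attacker against the underlying distribution would achieve; this is one way to see why the estimate cannot be trusted for large G.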
  5. Kiltz, E. (Ed.)
    The classical (parallel) black pebbling game is a useful abstraction which allows us to analyze the resources (space, space-time, cumulative space) necessary to evaluate a function f with a static data-dependency graph G. Of particular interest in the field of cryptography are data-independent memory-hard functions f_{G,H} which are defined by a directed acyclic graph (DAG) G and a cryptographic hash function H. The pebbling complexity of the graph G characterizes the amortized cost of evaluating f_{G,H} multiple times as well as the total cost to run a brute-force preimage attack over a fixed domain X, i.e., given y ∈ {0,1}^* find x ∈ X such that f_{G,H}(x) = y. While a classical attacker will need to evaluate the function f_{G,H} at least m = |X| times, a quantum attacker running Grover’s algorithm only requires O(√m) blackbox calls to a quantum circuit C_{G,H} evaluating the function f_{G,H}. Thus, to analyze the cost of a quantum attack it is crucial to understand the space-time cost (equivalently width times depth) of the quantum circuit C_{G,H}. We first observe that a legal black pebbling strategy for the graph G does not necessarily imply the existence of a quantum circuit with comparable complexity, in contrast to the classical setting where any efficient pebbling strategy for G corresponds to an algorithm with comparable complexity for evaluating f_{G,H}. Motivated by this observation we introduce a new parallel reversible pebbling game which captures additional restrictions imposed by the No-Deletion Theorem in Quantum Computing. We apply our new reversible pebbling game to analyze the reversible space-time complexity of several important graphs: Line Graphs, Argon2i-A, Argon2i-B, and DRSample. Specifically, (1) we show that a line graph of size N has reversible space-time complexity at most O(N^{1+2/√(log N)}). (2) We show that any (e, d)-reducible DAG has reversible space-time complexity at most O(Ne + dN2^d). 
In particular, this implies that the reversible space-time complexity of Argon2i-A and Argon2i-B is at most O(N^2 log log N/√(log N)) and O(N^2/(log N)^{1/3}), respectively. (3) We show that the reversible space-time complexity of DRSample is at most O((N^2 log log N)/log N). We also study the cumulative pebbling cost of reversible pebblings extending a (non-reversible) pebbling attack of Alwen and Blocki on depth-reducible graphs. 
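
To illustrate what a reversible pebbling of a line graph looks like, here is a minimal sketch of the classical sequential strategy usually attributed to Bennett. It is not the parallel game or the strategy analyzed in the abstract, and it does not achieve the bound stated there; it only shows the reversibility constraint. The rule assumed here is that a pebble on node v may be placed or removed only while node v-1 is pebbled, with node 0 standing for an always-available input. The function names are illustrative.

```python
def bennett_moves(start, k):
    """Yield moves that toggle the pebble on node start + 2**k.

    Assumes node `start` is pebbled (or start == 0, the always-available
    input) and the nodes strictly between are unpebbled. Each yielded
    value is a node whose pebble is toggled (placed if absent, removed
    if present).
    """
    if k == 0:
        yield start + 1
        return
    mid = start + 2 ** (k - 1)
    yield from bennett_moves(start, k - 1)  # place a pebble on mid
    yield from bennett_moves(mid, k - 1)    # toggle the target node
    yield from bennett_moves(start, k - 1)  # reversibly remove mid


def simulate(moves):
    """Check legality of each move; return (pebbled, steps, max_space)."""
    pebbled, steps, max_space = set(), 0, 0
    for v in moves:
        if v - 1 != 0 and (v - 1) not in pebbled:
            raise ValueError(f"illegal move on node {v}")
        pebbled ^= {v}  # toggle the pebble on v
        steps += 1
        max_space = max(max_space, len(pebbled))
    return pebbled, steps, max_space


k = 3
final, steps, space = simulate(bennett_moves(0, k))
print(final, steps, space, steps * space)  # product = space-time cost
```

For a line of N = 2^k nodes this strategy takes 3^k moves and uses at most k + 1 pebbles, so its space-time cost grows like N^{log2 3} · log N. The gap between that and a near-linear bound is what makes tighter reversible strategies interesting.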
  6. Dodis, Y. (Ed.)
    Memory-hard functions (MHFs) are a useful cryptographic primitive which can be used to design egalitarian proof of work puzzles and to protect low entropy secrets like passwords against brute-force attackers. Intuitively, a memory-hard function is a function whose evaluation costs are dominated by memory costs even if the attacker uses specialized hardware (FPGAs/ASICs), and several cost metrics have been proposed to quantify this intuition. For example, space-time cost looks at the product of running time and the maximum space usage over the entire execution of an algorithm. Alwen and Serbinenko (STOC 2015) observed that the space-time cost of evaluating a function multiple times may not scale linearly in the number of instances being evaluated and introduced the stricter requirement that a memory-hard function has high cumulative memory complexity (CMC) to ensure that an attacker’s amortized space-time costs remain large even if the attacker evaluates the function on multiple different inputs in parallel. Alwen et al. (EUROCRYPT 2018) observed that the notion of CMC still gives the attacker undesirable flexibility in selecting space-time tradeoffs e.g., while the MHF Scrypt has maximal CMC Ω(N^2), an attacker could evaluate the function with constant O(1) memory in time O(N^2). Alwen et al. introduced an even stricter notion of Sustained Space complexity and designed an MHF which has s=Ω(N/logN) sustained complexity t=Ω(N) i.e., any algorithm evaluating the function in the parallel random oracle model must have at least t=Ω(N) steps where the memory usage is at least Ω(N/logN). In this work, we use dynamic pebbling games and dynamic graphs to explore tradeoffs between sustained space complexity and cumulative memory complexity for data-dependent memory-hard functions such as Argon2id and Scrypt. 
We design our own dynamic graph (dMHF) with the property that any dynamic pebbling strategy either (1) has Ω(N) rounds with Ω(N) space, or (2) has CMC Ω(N^{3−ϵ}), substantially larger than N^2. For Argon2id we show that any dynamic pebbling strategy either (1) has Ω(N) rounds with Ω(N^{1−ϵ}) space, or (2) has CMC ω(N^2). We also present a dynamic version of DRSample (Alwen et al. 2017) for which any dynamic pebbling strategy either (1) has Ω(N) rounds with Ω(N/log N) space, or (2) has CMC Ω(N^3/log N). 
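
The three cost metrics discussed above can be read off a pebbling trace directly. The sketch below assumes a trace is simply a list of pebble sets, one per round, and uses hypothetical names; it illustrates the definitions only and says nothing about the lower bounds in the abstract.

```python
def pebbling_costs(trace, s):
    """Compute cost metrics of a pebbling trace.

    trace: list of sets, the pebbled nodes after each round (assumed format).
    s:     space threshold used for the sustained-space count.
    """
    sizes = [len(p) for p in trace]
    return {
        # running time multiplied by the maximum space ever used
        "space_time": len(sizes) * max(sizes, default=0),
        # cumulative cost: total space summed over all rounds
        "cumulative": sum(sizes),
        # number of rounds in which at least s pebbles are on the graph
        "sustained_rounds": sum(1 for size in sizes if size >= s),
    }


trace = [{1}, {1, 2}, {1, 2, 3}, {3}, {3, 4}]
print(pebbling_costs(trace, s=2))
```

A strategy can keep the cumulative cost well below the space-time cost by holding many pebbles only briefly, and it can keep the sustained-space count low by using large space in few rounds; the tradeoff results above bound how far both can be pushed at once.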
  7. Galdi, C ; Jarecki, S. (Ed.)
    In the past decade billions of user passwords have been exposed to the dangerous threat of offline password cracking attacks. An offline attacker who has stolen the cryptographic hash of a user’s password can check as many password guesses as s/he likes, limited only by the resources that s/he is willing to invest to crack the password. Pepper and key-stretching are two techniques that have been proposed to deter an offline attacker by increasing guessing costs. Pepper ensures that the cost of rejecting an incorrect password guess is higher than the (expected) cost of verifying a correct password guess. This is useful because most of the offline attacker’s guesses will be incorrect. Unfortunately, as we observe, the traditional peppering defense seems to be incompatible with modern memory-hard key-stretching algorithms such as Argon2 or Scrypt. We introduce an alternative to pepper which we call Cost-Asymmetric Memory Hard Password Authentication, which benefits from the same cost-asymmetry as the classical peppering defense, i.e., the cost of rejecting an incorrect password guess is larger than the expected cost to authenticate a correct password guess. When configured properly, we prove that our mechanism can only reduce the percentage of user passwords that are cracked by a rational offline attacker whose goal is to maximize (expected) profit, i.e., the total value of cracked passwords minus the total guessing costs. We evaluate the effectiveness of our mechanism on empirical password datasets against a rational offline attacker. Our empirical analysis shows that our mechanism can significantly reduce the percentage of user passwords that are cracked by a rational attacker by up to 10%. 
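
The cost asymmetry of the classical peppering defense can be seen in a few lines. The sketch below illustrates classical pepper only, not the Cost-Asymmetric Memory Hard Password Authentication mechanism of the abstract. `PEPPER_RANGE` and the function names are assumptions, and plain SHA-256 stands in for a real password hash purely to keep the example self-contained; it is not suitable for production use.

```python
import hashlib
import hmac
import secrets

PEPPER_RANGE = 16  # hypothetical number of possible pepper values


def _hash(password, salt, pepper):
    # SHA-256 is a stand-in for a real password hashing function.
    return hashlib.sha256(salt + bytes([pepper]) + password.encode()).digest()


def register(password):
    """Hash with a random pepper that is then discarded, never stored."""
    salt = secrets.token_bytes(16)
    pepper = secrets.randbelow(PEPPER_RANGE)
    return salt, _hash(password, salt, pepper)


def verify(password, salt, digest):
    """Try every pepper value; return (accepted, hash_evaluations)."""
    for trials, pepper in enumerate(range(PEPPER_RANGE), start=1):
        if hmac.compare_digest(_hash(password, salt, pepper), digest):
            return True, trials
    return False, PEPPER_RANGE


salt, digest = register("correct horse")
print(verify("correct horse", salt, digest))  # stops at the matching pepper
print(verify("wrong guess", salt, digest))    # exhausts all pepper values
```

A correct password is accepted after about half the pepper range on average, while every incorrect guess pays for the full range. Since most of an offline attacker's guesses are wrong, the attacker bears most of the extra cost.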
  8. Locally Decodable Codes (LDCs) are error-correcting codes for which individual message symbols can be quickly recovered despite errors in the codeword. LDCs for Hamming errors have been studied extensively in the past few decades, where a major goal is to understand the amount of redundancy that is necessary and sufficient to decode from large amounts of error, with small query complexity. Despite exciting progress, we still don't have satisfactory answers in several important parameter regimes. For example, in the case of 3-query LDCs, the gap between existing constructions and lower bounds is superpolynomial in the message length. In this work we study LDCs for insertion and deletion errors, called Insdel LDCs. Their study was initiated by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), who gave a reduction from Hamming LDCs to Insdel LDCs with a small blowup in the code parameters. On the other hand, the only known lower bounds for Insdel LDCs come from those for Hamming LDCs, thus there is no separation between them. Here we prove new, strong lower bounds for the existence of Insdel LDCs. In particular, we show that 2-query linear Insdel LDCs do not exist, and give an exponential lower bound for the length of all q-query Insdel LDCs with constant q. For q ≥ 3 our bounds are exponential in the existing lower bounds for Hamming LDCs. Furthermore, our exponential lower bounds continue to hold for adaptive decoders, and even in private-key settings where the encoder and decoder share secret randomness. This exhibits a strict separation between Hamming LDCs and Insdel LDCs. Our strong lower bounds also hold for the related notion of Insdel LCCs (except in the private-key setting), due to an analogue to the Insdel notions of a reduction from Hamming LCCs to LDCs. 
Our techniques are based on a delicate design and analysis of hard distributions of insertion and deletion errors, which depart significantly from typical techniques used in analyzing Hamming LDCs. 
  9. Large-scale online password guessing attacks are widespread and pose a persistent privacy and security threat to users. The common method for mitigating the risk of online cracking is to lock out the user after a fixed number (K) of consecutive incorrect login attempts. Selecting the value of K induces a classic security-usability trade-off. When K is too large, a hacker can (quickly) break into a significant fraction of user accounts, but when K is too low, we will start to annoy honest users by locking them out after a few mistakes. Motivated by the observation that honest user mistakes typically look quite different from an online attacker's password guesses, we introduce DALock, a distribution-aware password lockout mechanism to reduce user annoyance while minimizing user risk. As the name suggests, DALock is designed to be aware of the frequency and popularity of the password used for login attacks. In contrast, standard throttling mechanisms (e.g., K-strikes) are oblivious to the password distribution. In particular, DALock maintains an extra "hit count" in addition to the "strike count" for each user, which is based on (estimates of) the cumulative probability of all login attempts for that particular account. We empirically evaluate DALock with an extensive battery of simulations using real-world password datasets. In comparison with the traditional K-strikes mechanism, our simulations indicate that DALock offers a superior simulated security/usability trade-off. For example, in one of our simulations, we are able to reduce the success rate of an attacker to 0.05% (compared to 1% for the 3-strikes mechanism) whilst simultaneously reducing the unwanted lockout rate for accounts that are not under attack to just 0.08% (compared to 4% for the 3-strikes mechanism). 
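
The idea of pairing a strike count with a probability-weighted hit count can be sketched as follows. This is a loose illustration of the description above, not DALock itself: the thresholds, the class and method names, the probability estimator, and the choice to count only failed attempts are all assumptions made for the example.

```python
class DistributionAwareLockoutSketch:
    """Hypothetical per-account lockout state with two counters."""

    def __init__(self, password_prob, max_strikes=10, max_hits=0.012):
        self.password_prob = password_prob  # guess -> estimated probability
        self.max_strikes = max_strikes      # illustrative strike threshold
        self.max_hits = max_hits            # illustrative hit-count threshold
        self.strikes = 0
        self.hits = 0.0

    def record_failed_attempt(self, guess):
        """Count one strike and add the guess's estimated popularity."""
        self.strikes += 1
        self.hits += self.password_prob(guess)

    def locked(self):
        return self.strikes >= self.max_strikes or self.hits >= self.max_hits


# Toy popularity estimates (illustrative only); unknown strings get 0.
popularity = {"123456": 0.01, "password": 0.005}


def estimate(guess):
    return popularity.get(guess, 0.0)


honest = DistributionAwareLockoutSketch(estimate)
for typo in ["Tr0ub4dor&4", "Tr0ub4dor&33", "tr0ub4dor&3", "Tr0ub4dor3", "Tr0ubador&3"]:
    honest.record_failed_attempt(typo)

attacked = DistributionAwareLockoutSketch(estimate)
for guess in ["123456", "password"]:
    attacked.record_failed_attempt(guess)

print(honest.locked(), attacked.locked())
```

Typos of an uncommon password add almost nothing to the hit count, so the strike threshold can be set generously, while a run of popular guesses trips the hit threshold after very few attempts.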
  10. We formally introduce, define, and construct memory-hard puzzles. Intuitively, for a difficulty parameter t, a cryptographic puzzle is memory-hard if any parallel random access machine (PRAM) algorithm with "small" cumulative memory complexity (≪ t^2) cannot solve the puzzle; moreover, such puzzles should be both "easy" to generate and be solvable by a sequential RAM algorithm running in time t. Our definitions and constructions of memory-hard puzzles are in the standard model, assuming the existence of indistinguishability obfuscation (iO) and one-way functions (OWFs), and additionally assuming the existence of a memory-hard language. Intuitively, a language is memory-hard if it is undecidable by any PRAM algorithm with "small" cumulative memory complexity, while a sequential RAM algorithm running in time t can decide the language. Our definitions and constructions of memory-hard objects are the first such definitions and constructions in the standard model without relying on idealized assumptions (such as random oracles). We give two applications which highlight the utility of memory-hard puzzles. For our first application, we give a construction of a (one-time) memory-hard function (MHF) in the standard model, using memory-hard puzzles and additionally assuming iO and OWFs. For our second application, we show any cryptographic puzzle (e.g., memory-hard, time-lock) can be used to construct resource-bounded locally decodable codes (LDCs) in the standard model, answering an open question of Blocki, Kulkarni, and Zhou (ITC 2020). Resource-bounded LDCs achieve better rate and locality than their classical counterparts under the assumption that the adversarial channel is resource bounded (e.g., a low-depth circuit). Prior constructions of MHFs and resource-bounded LDCs required idealized primitives like random oracles. 