skip to main content

Title: DAHash: Distribution Aware Tuning of Password Hashing Costs
An attacker who breaks into an authentication server and steals all of the cryptographic password hashes is able to mount an offline-brute force attack against each user’s password. Offline brute-force attacks against passwords are increasingly commonplace and the danger is amplified by the well documented human tendency to select low-entropy password and/or reuse these passwords across multiple accounts. Moderately hard password hashing functions are often deployed to help protect passwords against offline attacks by increasing the attacker’s guessing cost. However, there is a limit to how “hard” one can make the password hash function as authentication servers are resource constrained and must avoid introducing substantial authentication delay. Observing that there is a wide gap in the strength of passwords selected by different users we introduce DAHash (Distribution Aware Password Hashing) a novel mechanism which reduces the number of passwords that an attacker will crack. Our key insight is that a resource-constrained authentication server can dynamically tune the hardness parameters of a password hash function based on the (estimated) strength of the user’s password. We introduce a Stackelberg game to model the interaction between a defender (authentication server) and an offline attacker. Our model allows the defender to optimize the parameters of DAHash e.g., specify how much effort is spent in hashing weak/moderate/high strength passwords. We use several large scale password frequency datasets to empirically evaluate the effectiveness of our differentiated cost password hashing mechanism. We find that the defender who uses our mechanism can reduce the fraction of passwords that would be cracked by a rational offline attacker by up to 15%.  more » « less
Award ID(s):
1704587 1931443 1755708
Author(s) / Creator(s):
Borisov, N.
Date Published:
Journal Name:
International Conference on Financial Cryptography and Data Security (FC 2021)
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We introduce password strength signaling as a potential defense against password cracking. Recent breaches have exposed billions of user passwords to the dangerous threat of offline password cracking attacks. An offline attacker can quickly check millions (or sometimes billions/trillions) of password guesses by comparing a candidate password’s hash value with a stolen hash from a breached authentication server. The attacker is limited only by the resources he is willing to invest. We explore the feasibility of applying ideas from Bayesian Persuasion to password authentication. Our key idea is to have the authentication server store a (noisy) signal about the strength of each user password for an offline attacker to find. Surprisingly, we show that the noise distribution for the signal can often be tuned so that a rational (profit-maximizing) attacker will crack fewer passwords. The signaling scheme exploits the fact that password cracking is not a zero-sum game i.e., it is possible for an attacker to increase their profit in a way that also reduces the number of cracked passwords. Thus, a well-defined signaling strategy will encourage the attacker to reduce his guessing costs by cracking fewer passwords. We use an evolutionary algorithm to compute the optimal signaling scheme for the defender. We evaluate our mechanism on several password datasets and show that it can reduce the total number of cracked passwords by up to 12% (resp. 5%) of all users in defending against offline (resp. online) attacks. While the results of our empirical analysis are positive we stress that we view the current solution as a proof-of-concept as there are important societal concerns that would need to be considered before adopting our password strength signaling solution. 
    more » « less
  2. Galdi, C ; Jarecki, S. (Ed.)
    In the past decade billions of user passwords have been exposed to the dangerous threat of offline password cracking attacks. An offline attacker who has stolen the cryptographic hash of a user’s password can check as many password guesses as s/he likes limited only by the resources that s/he is willing to invest to crack the password. Pepper and key-stretching are two techniques that have been proposed to deter an offline attacker by increasing guessing costs. Pepper ensures that the cost of rejecting an incorrect password guess is higher than the (expected) cost of verifying a correct password guess. This is useful because most of the offline attacker’s guesses will be incorrect. Unfortunately, as we observe the traditional peppering defense seems to be incompatible with modern memory hard key-stretching algorithms such as Argon2 or Scrypt. We introduce an alternative to pepper which we call Cost-Asymmetric Memory Hard Password Authentication which benefits from the same cost-asymmetry as the classical peppering defense i.e., the cost of rejecting an incorrect password guess is larger than the expected cost to authenticate a correct password guess. When configured properly we prove that our mechanism can only reduce the percentage of user passwords that are cracked by a rational offline attacker whose goal is to maximize (expected) profit i.e., the total value of cracked passwords minus the total guessing costs. We evaluate the effectiveness of our mechanism on empirical password datasets against a rational offline attacker. Our empirical analysis shows that our mechanism can significantly reduce the percentage of user passwords that are cracked by a rational attacker by up to 10%. 
    more » « less
  3. In the past few years billions of user passwords have been exposed to the threat of offline cracking attempts. Such brute-force cracking attempts are increasingly dangerous as password cracking hardware continues to improve and as users continue to select low entropy passwords. Key-stretching techniques such as hash iteration and memory hard functions can help to mitigate the risk, but increased key-stretching effort necessarily increases authentication delay so this defense is fundamentally constrained by usability concerns. We introduce Just in Time Hashing (JIT), a client side key-stretching algorithm to protect user passwords against offline brute-force cracking attempts without increasing delay for the user. The basic idea is to exploit idle time while the user is typing in their password to perform extra key-stretching. As soon as the user types in the first character(s) of their password our algorithm immediately begins filling memory with hash values derived from the character(s) that the user has typed thus far. We conduct a user study to guide the development of JIT e.g. by determining how much extra key-stretching could be performed during idle cycles or how many consecutive deletions JIT may need to handle. Our security analysis demonstrates that JIT can substantially increase guessing costs over traditional key-stretching algorithms with equivalent (or less) authentication delay. Specifically an empirical evaluation using existing password datasets demonstrates that JIT increases guessing costs by nearly an order of magnitude in comparison to standard key-stretching techniques with comparable delay. We provide a proof-of-concept implementation of a Just in Time Hashing algorithm by modifying Argon2. 
    more » « less
  4. We develop an economic model of an offline password cracker which allows us to make quantitative predictions about the fraction of accounts that a rational password attacker would crack in the event of an authentication server breach. We apply our economic model to analyze recent massive password breaches at Yahoo!, Dropbox, LastPass and AshleyMadison. All four organizations were using key-stretching to protect user passwords. In fact, LastPass' use of PBKDF2-SHA256 with $10^5$ hash iterations exceeds 2017 NIST minimum recommendation by an order of magnitude. Nevertheless, our analysis paints a bleak picture: the adopted key-stretching levels provide insufficient protection for user passwords. In particular, we present strong evidence that most user passwords follow a Zipf's law distribution, and characterize the behavior of a rational attacker when user passwords are selected from a Zipf's law distribution. We show that there is a finite threshold which depends on the Zipf's law parameters that characterizes the behavior of a rational attacker --- if the value of a cracked password (normalized by the cost of computing the password hash function) exceeds this threshold then the adversary's optimal strategy is {\em always} to continue attacking until each user password has been cracked. In all cases (Yahoo!, Dropbox, LastPass and AshleyMadison) we find that the value of a cracked password almost certainly exceeds this threshold meaning that a rational attacker would crack all passwords that are selected from the Zipf's law distribution (i.e., most user passwords). This prediction holds even if we incorporate an aggressive model of diminishing returns for the attacker (e.g., the total value of $500$ million cracked passwords is less than $100$ times the total value of $5$ million passwords). On a positive note our analysis demonstrates that memory hard functions (MHFs) such as SCRYPT or Argon2i can significantly reduce the damage of an offline attack. In particular, we find that because MHFs substantially increase guessing costs a rational attacker will give up well before he cracks most user passwords and this prediction holds even if the attacker does not encounter diminishing returns for additional cracked passwords. Based on our analysis we advocate that password hashing standards should be updated to require the use of memory hard functions for password hashing and disallow the use of non-memory hard functions such as BCRYPT or PBKDF2. 
    more » « less
  5. null (Ed.)
    Memory hard functions (MHFs) are an important cryptographic primitive that are used to design egalitarian proofs of work and in the construction of moderately expensive key-derivation functions resistant to brute-force attacks. Broadly speaking, MHFs can be divided into two categories: data-dependent memory hard functions (dMHFs) and data-independent memory hard functions (iMHFs). iMHFs are resistant to certain side-channel attacks as the memory access pattern induced by the honest evaluation algorithm is independent of the potentially sensitive input e.g., password. While dMHFs are potentially vulnerable to side-channel attacks (the induced memory access pattern might leak useful information to a brute-force attacker), they can achieve higher cumulative memory complexity (CMC) in comparison than an iMHF. In particular, any iMHF that can be evaluated in N steps on a sequential machine has CMC at most 𝒪((N^2 log log N)/log N). By contrast, the dMHF scrypt achieves maximal CMC Ω(N^2) - though the CMC of scrypt would be reduced to just 𝒪(N) after a side-channel attack. In this paper, we introduce the notion of computationally data-independent memory hard functions (ciMHFs). Intuitively, we require that memory access pattern induced by the (randomized) ciMHF evaluation algorithm appears to be independent from the standpoint of a computationally bounded eavesdropping attacker - even if the attacker selects the initial input. We then ask whether it is possible to circumvent known upper bound for iMHFs and build a ciMHF with CMC Ω(N^2). Surprisingly, we answer the question in the affirmative when the ciMHF evaluation algorithm is executed on a two-tiered memory architecture (RAM/Cache). We introduce the notion of a k-restricted dynamic graph to quantify the continuum between unrestricted dMHFs (k=n) and iMHFs (k=1). For any ε > 0 we show how to construct a k-restricted dynamic graph with k=Ω(N^(1-ε)) that provably achieves maximum cumulative pebbling cost Ω(N^2). We can use k-restricted dynamic graphs to build a ciMHF provided that cache is large enough to hold k hash outputs and the dynamic graph satisfies a certain property that we call "amenable to shuffling". In particular, we prove that the induced memory access pattern is indistinguishable to a polynomial time attacker who can monitor the locations of read/write requests to RAM, but not cache. We also show that when k=o(N^(1/log log N)) , then any k-restricted graph with constant indegree has cumulative pebbling cost o(N^2). Our results almost completely characterize the spectrum of k-restricted dynamic graphs. 
    more » « less