Why you want to encrypt password hashes

Troy Hunt of Have I Been Pwned fame wrote that, because encryption is reversible but hashing is not, passwords should be hashed, not encrypted. I think passwords should be hashed-then-encrypted, i.e., they are hashed with a slow memory-hard function like Argon2, scrypt or bscrypt, then the hashes are encrypted using AES-SIV with a key residing in a remote oracle. I believe this offers the best security at a minimum cost while keeping pretty good usability. It is not exactly a new idea: the remote key works as a secret pepper, and Facebook also uses a remote oracle in their password hashing scheme.

My first big project at Google was to modernize our password hashing scheme. I had so much fun, spending days designing and implementing a variant of scrypt in assembly (because I could, and because it was actually faster than GCC intrinsics). In the end, however, I realized it's a game that I can never win. I need a slow memory-hard hash function, but it can't be too slow because of usability.

The scrypt paper recommends that "100 ms is a reasonable upper bound on the delay which should be cryptographically imposed on interactive logins". While in theory it's easy to let users wait 100ms, in practice it's hardly a pleasant user experience, especially when, every few minutes, large latency spikes in other parts of the system bump the waiting time to 500ms or even more.

Even if I were allowed to use 100ms and as much memory as I wanted, the game would still not be in our favor. We want to stop two threats:

  • Mass password cracking. The attacker wants to crack as many passwords as possible.
  • Targeted password cracking. The attacker wants to crack only a handful of passwords.

Most user passwords, however, are weak, taking fewer than 2^40 guesses to crack. A slow hash function might partially mitigate the first threat, but it won't stop the second. In both cases, we're at the mercy of the adversary: they'll get more passwords if they spend more resources. At our scale, even if they manage to get only 1% of the passwords, it'll affect tens of millions of users. Because the hashes can be cracked offline, there's nothing we can do to stop the attackers. Or can we?

The key insight I learned from my project is that I want a hash function that is fast on our hardware, but slow anywhere else. This is difficult to achieve though, because we use commodity, off-the-shelf hardware. This line of thought is, however, in the right direction. We want to "rig" the game, creating some kind of asymmetry in our favor. Taking it a step further, what we need is a function that is fast on our infrastructure. While attackers can buy whatever hardware we bought, reproducing our infrastructure is much more expensive.

This is the strategy driving the hash-then-encrypt design I mentioned at the beginning. In its simplest form, it works as follows:

  1. hash the password using a moderately slow function;
  2. send the hash to a remote oracle for encryption; the remote oracle deterministically encrypts the hash with a key that never leaves the oracle; and
  3. store the result as the final password hash.
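
The three steps above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the scheme deployed anywhere: `PepperOracle` is a hypothetical in-process stand-in for the remote oracle (a real one would sit behind an RPC), the scrypt parameters are illustrative, and because the standard library has no AES-SIV, the oracle here uses the HMAC-SHA256 PRF variant instead, which gives up the ability to decrypt and re-encrypt on key rotation.

```python
import hashlib
import hmac
import os
from typing import Tuple


class PepperOracle:
    """Hypothetical local stand-in for the remote oracle."""

    def __init__(self) -> None:
        # The key is created inside the oracle and never handed to callers.
        self._key = os.urandom(32)

    def transform(self, password_hash: bytes) -> bytes:
        # Deterministic keyed transform. HMAC-SHA256 is a PRF, not AES-SIV,
        # so the output cannot be decrypted for key rotation.
        return hmac.new(self._key, password_hash, hashlib.sha256).digest()


def slow_hash(password: str, salt: bytes) -> bytes:
    # Step 1: moderately slow, memory-hard hash (illustrative parameters).
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )


def enroll(password: str, oracle: PepperOracle) -> Tuple[bytes, bytes]:
    salt = os.urandom(16)
    # Step 2: send the hash to the oracle. Step 3: store what comes back.
    stored = oracle.transform(slow_hash(password, salt))
    return salt, stored


def verify(password: str, salt: bytes, stored: bytes, oracle: PepperOracle) -> bool:
    candidate = oracle.transform(slow_hash(password, salt))
    # Constant-time comparison of the stored and recomputed values.
    return hmac.compare_digest(candidate, stored)
```

A database dump alone contains only `salt` and `stored`; without the oracle's key, an attacker cannot test password guesses against it.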

Round-trip RPC requests within the same datacenter take 0.5ms, meaning encryption with the remote oracle is dirt cheap and yet very effective. Offline password cracking no longer works. The attackers must talk to the remote oracle, which can throttle brute-force attempts and alert SREs. They can go after the remote key, but that would generate even more noise and increase the chance that they'd be caught.
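
To illustrate how an oracle might throttle and alert, here is a rough sketch. Everything in it is an assumption made for illustration: the `ThrottledOracle` class, the per-account sliding window, the `account_id` argument, the limits, and the `alerts` list standing in for paging an SRE. As in a stdlib-only setting, HMAC-SHA256 stands in for the keyed transform.

```python
import hashlib
import hmac
import os
import time
from collections import defaultdict, deque
from typing import Optional


class ThrottledOracle:
    """Hypothetical oracle that caps requests per account in a sliding window."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 60.0) -> None:
        self._key = os.urandom(32)  # never leaves the oracle
        self._max = max_requests
        self._window = window_seconds
        self._recent = defaultdict(deque)  # account id -> request timestamps
        self.alerts = []  # stand-in for alerting an on-call engineer

    def transform(
        self, account_id: str, password_hash: bytes, now: Optional[float] = None
    ) -> bytes:
        now = time.monotonic() if now is None else now
        stamps = self._recent[account_id]
        # Forget requests that have fallen out of the window.
        while stamps and now - stamps[0] >= self._window:
            stamps.popleft()
        if len(stamps) >= self._max:
            # Too many attempts: record an alert and refuse to answer.
            self.alerts.append(account_id)
            raise PermissionError("rate limit exceeded for " + account_id)
        stamps.append(now)
        return hmac.new(self._key, password_hash, hashlib.sha256).digest()
```

Since every guess has to pass through `transform`, the attacker's guessing rate is bounded by whatever the oracle allows rather than by the hardware they can buy.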

The deterministic encryption is required to enable key rotation. If the remote key is compromised, you can decrypt current hashes, and re-encrypt with a new key, without waiting for users to sign in. You can also use a PRF such as HMAC-SHA256, but key rotation would be difficult. You can also avoid sending the password hash to the remote oracle, and perform the encryption locally using a key derived from the remote key, but throttling would not be as effective. Deterministic encryption can also be replaced with probabilistic encryption, though the oracle must now support a decrypt method.

Comments

Unknown said…
Is there any chance to replace passwords with another login mechanism?