Featured Post

Có một Biển Đông trên không gian mạng

Có một Biển Đông trên không gian mạng Thái Dương Mùa hè 2014, giữa lúc người Việt trong nước và hải ngoại đang sôi sục vì Trung Quốc đư...

Tuesday, April 22, 2014

Friday, April 18, 2014

Fairy tales in password hashing with scrypt

Update: the author of scrypt said that he's added a warning on top of scryptenc.h.

TL;DR: scrypt is a password-based key derivation function that is often used as a password hashing scheme. libscrypt is considered the official implementation of scrypt, but it's actually a file encryption API with the scrypt function never exposed directly. If one misuses libscrypt for password hashing they may end up with extremely weak hashes.

Password-based key derivation functions (PBKDF) are algorithms that take a password and some non-secret data (e.g., salt), and output one or more secret keys that can be further used in other crypto constructions. These functions are generally made deliberately slow so as to frustrate brute-force attack or dictionary attack on the input password.

scrypt is considered a state-of-the-art PBKDF, but its most common use is as a password hashing scheme, even though it wasn't originally designed for this purpose. The author of scrypt has never released a standalone library for the scrypt function; he's released only a utility, confusingly named scrypt, that uses the scrypt function to implement an encryption API.

This makes a lot of senses, as scrypt was originally built for Tarsnap, the online file backup service built by the same person. It's also, however, a crypto disaster waiting to happen: if one uses the file encryption API as a password hashing scheme they may end up with extremely weak hashes. Since the introduction of scrypt I've seen a few of such misuses, all of which were made by otherwise competent programmers.

The scrypt encryption API is declared as follows:

 * scryptenc_buf(inbuf, inbuflen, outbuf, passwd, passwdlen,
 *     maxmem, maxmemfrac, maxtime):
 * Encrypt inbuflen bytes from inbuf, writing the resulting inbuflen + 128
 * bytes to outbuf.
int scryptenc_buf(const uint8_t *, size_t, uint8_t *,
    const uint8_t *, size_t, size_t, double, double);

The intended usage of this function is to derive a key from the password using scrypt (with a randomly generated salt,) then use the derived key to encrypt the input buffer. I've found that many people that want to use scrypt as a password hashing function end up using scryptenc_buf.

In the first case the programmer used scryptenc_buf to encrypt an empty string using the input password as the key, and saved the output as the password hash. There's no badness: although the hash isn't compatible with scrypt anymore, it still has the same strength. It went downhill very fast from here though.

The second programmer used the same function, but she encrypted the input password with a static key. Since scryptenc_buf generates a random salt for each invocation, each password is probably encrypted with a unique key (derived from the random salt and the static key,) but it's still pretty bad: if anyone obtains the static key they can recover all passwords from their hashes. The developer knew that encrypting password is a bad idea, but she was confused by the API. After all the hashes looked random. As the saying goes, bad crypto is usually indistinguishable from good crypto.

The third programmer, for some reason that I've forgotten, wanted to use a single salt for all users, so he modified scryptenc_buf as follows:

scryptenc_buf(const uint8_t * inbuf, size_t inbuflen, uint8_t * outbuf, const uint8_t * passwd,
    size_t passwdlen, const uint8_t * salt, size_t saltlen,
    size_t maxmem, double maxmemfrac, double maxtime);

Like the second programmer, he encrypted the input password with a static key. With a static salt, all passwords are encrypted under one single key. The encryption algorithm used in scryptenc_buf is AES in counter mode with a zero IV. Did you spot the vulnerability? It's the classic keystream reuse: knowing a single password leads to the recovery of all passwords from their hashes. If one can register as a user they can recover all passwords. How comes hashing password introduces such a deadly vulnerability? I run out of people to blame.

The last case was a group of Java programmers. One of them wrote a JNI wrapper on top of libscrypt. The wrapper accepted a memcost parameter, which should be the same as \(N\) in the original scrypt paper, but somehow its author wanted it to be \(\log{N}\). When another programmer called this function he passed, however, \(N\), so the actual memcost parameter became super big. This mismatch should be caught easily, as libscrypt would return an error code if it couldn't allocate the required memory. Checking return value for errors seems not to be, however, a popular pattern in the Java world.

As a result no password was hashed, and all stored hashes were a series of zero bytes. Anyone could sign in to anybody else's accounts using any passwords. Fortunately, they brought my consulting team in very early in the development process, and I found the bug before they had any real users. Moral of the story? Always look at the output of your crypto.

If I develop a crypto library, I'll conduct user studies like how they do it in usability research. Give developers the library and ask them to conduct a specific task. Rinse and repeat until nobody would misuse it.

Tuesday, April 15, 2014

Obama is a liar

Theo Google, một công ty của Mỹ, thì đương kim tổng thống Mỹ là một thằng nói láo, ngu dốt, theo đạo Hồi và cả cộng sản. Mà Google chỉ là dựa vào gợi ý của người dùng Internet thôi nhé: đây là những cụm từ xuất hiện nhiều nhất khi người ta muốn mô tả Obama. Không tin thì bạn thử tìm đi. Thấy chưa? Thấy dân Mỹ chửi Obama như chửi con chưa? Thế mà chẳng thấy ai bị bắt gì cả. Lăng mạ như thế này thì còn thể thống gì nữa. Tự do mà không có khuôn khổ thế này thì loạn mất.

Thế mà nước Mỹ vẫn ổn, vẫn giàu mạnh. Một trăm năm mươi năm nay chuyển giao quyền lực không hề đổ một giọt máu, chỉ có tốn nước miếng thôi.

Trong lúc đó... có một ông tên là Trương Duy Nhất viết blog nói về các "Obama của Việt Nam", chưa dám chửi như chửi con, cũng chẳng dám mắng như chủ mắng đầy tớ, chỉ mới nói nhẹ vài tiếng, nhưng cũng đã được tặng vài cuốn lịch xé chơi qua ngày rồi.

Thử tưởng tượng chúng ta có một Edward Snowden của Việt Nam. Anh ấy vừa chôm được một đống tài liệu bí mật, phải làm gì tiếp theo bây giờ? Chẳng có tờ báo trong nước nào dám đăng. Gửi cho báo nước ngoài thì chính phủ sẽ chặn những tờ báo đó, mà dẫu không chặn thì cũng chẳng mấy ai biết mà tìm đọc. Chắc chỉ còn nước cân ký bán hoặc xé chùi đít dần.

Trong khi đó... Washington Post, tờ báo ở sát nhà Obama, đăng liền tù tì những tài liệu do Snowden gửi đến, rồi đoạt luôn giải Pulitzer, giải thưởng báo chí của chính nước Mỹ. Nghe mỉa mai cứ như là báo Nhân Dân được trao giải báo chí quốc gia vì đã phanh phui những phi vụ đàn áp tự do của chính phủ chúng mình.

Tự do ngôn luận là vậy đó. Rất đơn giản, đâu có gì khó hiểu đâu: tôi được quyền nói, anh được quyền nói, ai cũng được quyền nói và không một ai có quyền kiểm duyệt.

Friday, April 11, 2014

An idea to solve the CloudFlare Heartbleed challenge

tl;dr: list of values that if any of them is leaked one can recover the RSA private key: \(p\), \(q\), \(d \mod{(p-1)}\), \(d \mod{(q-1)}\), \(c^d \mod q\), and \(c^d \mod p\), where \(c\ = m^e\) and \(m\) is the premaster secret sent by the client.

I plan to take a look at the challenge this weekend, but it seems that someone has solved it in just a few hours. Very nice work. Anyway here's how I think this challenge could be solved:
1. Exploit the fact that if you know any intermediate values in the Chinese Remainder algorithm, you can recover the private key.
2. Search for these intermediate values in the leaked memory.

The private key will be used in two cases:
a) To decrypt the proposed premaster secret from the client.
b) To sign an ephemeral (EC)DH public key.

In both cases you know the value \(c\) and \(m\) in the equation \(c^d =  m \pmod N\), where \(d\) is the private key, and \(N\) is the modulus. To make it faster \(c^d \pmod N\) is carried out using Chinese Remainder algorithm in which there are two intermediate values \(x_1\) and \(x_2\) that are calculated as follows:
\(c^d = c^{d\mod{q-1}} = x_1 \pmod q\)
\(c^d = c^{d\mod{p-1}} = x_2 \pmod p\)

Note that \((m - x_1)\) is a multiple of \(q\). So you can use \(\gcd(m - x_1, N)\) to obtain \(q\), if you happen to know \(x_1\). Searching for \(x_1\) looks feasible because it's a bignum half the size of the modulus.

Warning: I haven't verified that this actually works. I just come up with the idea while taking a shower.

Update: I've taken a look at the code, and I think this idea should work. OpenSSL indeed uses Chinese Remainder.

static int RSA_eay_mod_exp(BIGNUM *r0, const BIGNUM *I, RSA *rsa, BN_CTX *ctx)
BIGNUM *r1,*m1,*vrfy;
BIGNUM local_dmp1,local_dmq1,local_c,local_r1;
BIGNUM *dmp1,*dmq1,*c,*pr1;
int ret=0;

r1 = BN_CTX_get(ctx);
m1 = BN_CTX_get(ctx);
vrfy = BN_CTX_get(ctx);

r1 and m1 are the intermediate values, and they seem to be allocated for every invocation of this function. Their bignum_st structures are re-used from a pool, but the actual values are freshly allocated from the heap, unless I miss something. This is important, because we want these values sitting near the Heartbeat request. Note that if you read this function you may get the impression that p and q are also freshly allocated, but in fact they aren't.

Update: someone pointed out that r1 and m1 will be scrubbed before Heartbeat requests are processed. Too bad :(.

Update: a simple exploit that simply searches for \(p\) in the leaked memory: https://gist.github.com/epixoip/10570627.

Update: you need to protect pre-calculated CRT parameters stored in the private key as well. See http://lekkertech.net/akamai.txt.

Besides \(d\), \(p\), and \(q\) inside the encoded private key there are also two pre-calcuted CRT parameters: \(d_p = d \pmod{p - 1}\) and \(d_q = d \pmod{q - 1}\). If you know \(d_p\) you can recover the private key as follows:
d_p & = d \pmod{p - 1} \\
\rightarrow ed_p & = ed \pmod{p-1} \\
\rightarrow ed_p & = 1 \pmod{p-1}\\
That means \(p - 1\) is a divisor of \(ed_p - 1\).