CodeGate 2009's Challenge 18 - Diffie-Hellman parameter tampering case study
1 Introduction
Last week I joined team CLGT to take part in the CodeGate 2009 organized by BeistLab. There's 21 challenges. This post is about challenge 18 which, IMHO, is one of the most interesting. You can download full report from CLGT here. There was only two teams could nail #18, and, unfortunately, we were not one of them. We were very close, just minutes away, from the final solution, but could not manage to solve it before the contest ended. Anyway, we're writing this writeup because we like it. This is a cryptography challenge. The objective is to decrypt the communication between a server and a client, which play a protocol involving RSA digital signature algorithm [1], Diffie-Hellman Key Protocol Agreement [2], and AES block cipher [3]. Section 2 describes the protocol and its setting in detail. Section 3 discusses some vulnerabilities of the protocol. Section 4 describes how we nail it. Section 5 discusses some ways to fix it. Section 6 concludes.2 The Protocol
As we said in the introduction, there's a client and server (in the challenge, they are port 9328 and 9000, respectively) who talk to each other using a protocol. The interesting part is they don't talk directly, but through a middle man who is, you guessed it, us.
The first thing we want to do is to understand the protocol, so in the beginning, we just pipe the data back and forth between the client and the server. As far as we know, the protocol includes four steps that consisting of seven numbers and two messages:
2.1 Step 1 – client sends his RSA public-key (e & n) to server
The client starts the protocol by sending two numbers to the server. At first, to be honest, we thought the first ten bytes number, which is a prime, is p in the DH protocol. After a very long try-and-error process, we concluded that this prime is in fact e in the client's RSA public-key. If it is e, then the second number, which is 38 bytes, should be n. We confirmed this theory in step 3.
The client starts the protocol by sending two numbers to the server. At first, to be honest, we thought the first ten bytes number, which is a prime, is p in the DH protocol. After a very long try-and-error process, we concluded that this prime is in fact e in the client's RSA public-key. If it is e, then the second number, which is 38 bytes, should be n. We confirmed this theory in step 3.
These two numbers are the only static numbers of the protocol, which makes sense since they are client's RSA public-key. All other numbers are somewhat random at first sight.
2.2 Step 2 – server sends DH's g, p, and A = g^x (mod p) to client
After receiving two numbers from the client in step 1, the server sends back three numbers. This step is where we were stuck. We knew that these 3 numbers are DH's parameters but didn't know which was which. The very reason is none of them was a prime! At first we thought they were encrypted by client's RSA public-key, so we tried to decrypt them, but still no prime appeared. Then somebody in #codegate suggested us looking at the value of each number. And until then, after wasting hours trying other theories, we recognized the obvious.
The second number was always greater than two other numbers. As g and A are both less than p, so we guessed the second number must be p. Of the two remained numbers, to know which was g and which was A, we tried to change them, and analyzed what we saw in the next step. Base on our analysis, which makes sense when you see what happened in step 3, we concluded that the first number is g, and the third number is A respectively.
To recap, in this step, the server sends DH's g, p, and A in that order to the client.
2.3 Step 3 – client sends B = g^y (mod p) and B's RSA signature to server
After receiving three numbers from the server in step 2, the client sends back two numbers. At first, we didn't know the role of each number. Once again after a very long try-and-error process, we saw they are related to each other. The second number is RSA signature of the first number which is signed by using the client's public-key seen in the first step.
So what is the first number? As you know, there are four public parameters in DH key agreement protocol, and we already saw three numbers in step 2, so this number should be B = g^y (mod p).
In summary, in this step, the client sends DH's B and B's RSA signature to the server.
2.4 Step 4 – server and client starts exchanging AES encrypted data
After receiving two numbers from the client in step 3, the server checks the RSA signature and returns a “SIGNATURE MISMATCH” message and terminates the protocol if it sees, you know, a modified B. Otherwise, the sever and the client starts exchanging data by sending one 16 bytes message back and forth.
The data is of course encrypted using a 128-bit block cipher which we guess is AES. What is the key? As you know, after completing step 3, the DH protocol is done. In other words, the client and server can both derive a shared secret which is A^y = B^x = g ^(xy) (mod p). They use this shared secret as the key to encrypt their data using AES.
In summary, the protocol consists of 4 steps which use RSA to verify data, DH to derive a key, and AES to encrypt messages.
3 Protocol Vulnerabilities
As you may already see, this protocol has several weaknesses. Below we show two major vulnerabilities which helps us the middle man to decrypt the messages exchanged between client and server.
3.1 Weak RSA public-key's n
The n part of client's RSA public-key is just 38 bytes = 304 bit which took us only seconds to factor it into the product of two primes:
26997494888422756793800618646853986321 = 5150852573609963923 * 5241364318354314827
As you know, the security of RSA is totally broken when n is factorizable like this, since we can easily compute the client private-key:
e = 3598711421, d = 15264128287770523976569188681594873497
Even if n is large enough, there's still another attack vector by replacing n with smaller numbers. Remember we are the trusted man in the middle!
3.2 Man-In-The-Middle Attacks against DH protocol
There's a lot of attacks against DH but the most powerful one is man-in-the-middle attack [5]. It has been known for years that an active attacker like us, capable of removing and adding messages, can easily break the core DH protocol. By intercepting A = g^x and B = g^y and replacing them with g^x' and g^y' respectively, we can fool client and server into thinking that they share a secret key. In fact, server will think that the secret key is g^xy' and client will believe that it is g^x'y which are both known by the attacker [4] . As a result, if we replace A with 1, and B with 1, then the shared secret will equal to 1.
By exploiting these two vulnerabilities, we can set the shared secret to whatever value we want, and therefore, decrypt subsequent messages.
4 How We Nail This Challenge
As we said in section 3, we nail this challenge by deploying a man-in-the-middle attack against the DH protocol.
In step 2 of the protocol, when the server sends its DH's g, p and A parameters to the client, we intercept and replace the third number with 1, then send the modified numbers to the client.
In step 3 of the protocol, when the client sends its DH's B parameter to the server, we intercept and replace it with 1. We can easily compute the RSA signature which is 1. Then we send the modified number and its signature to the server. Now we know the shared secret which, as stated section 3, equals to 1.
This shared secret is used as the key to encrypt subsequent messages using AES. But how? This is where we were stuck and lost 400 points :(. Since the shared secret is only one byte but AES requires a 16 bytes, we thought that we must do some calculation on it before actually use it to decrypt data.
At first, we tried to SHA1 the shared secret, and used the first 16 bytes output as AES key. This is the key derivation method recommended in DH Key Agreement Protocol's RFC [2]. But Alex told us not to do that. So we tried other methods, but, however, due to the lack of experience and time, we didn't get it until the contest was over. It was just 5 minutes :(. The solution is simply to zero padding the shared secret, to make it become something like '\x01' + '\x00' * 15, and uses that as the key to decrypt the messages. Run the attached Python script, and you'll see the messages in cleartext.
5 How To Fix The Protocol
The first obvious way to fix the protocol is...to forget it. Don't ever use it in production. Don't try to re-invent the wheel, there's existed secure protocols for the same purpose, for example TLS/SSL. But if we are forced to use this protocol, how should we modify it so that it becomes more secure?
The first thing we want to do is to ensure that RSA public-key's n number is large enough. After receiving n and e, the server must test to see if they meet the requirements of RSA cryptography standard [1]. The server must terminate the protocol if they don't. This will prevent somebody forging the signature in step 3.
For suggestion on how to use DH protocol securely, see [5].
6 Conclusion
At some point in the CodeGate2009, Alex sighed and said that “new school is weak at crypto”, and we agree. We definitely need to learn more. But, at least, by going this far, and writing such a detail report, we deserve some points, don't we?
7 Acknowledgement
Alex thank you so much for giving this gift to us. It was very nice of you to answer us many questions about this challenge. We're looking forward to seeing more cryptography challenges from you and BeistLab. Thanks somebody at #codegate for suggesting us to look at the value of each number in the step 2 of the protocol.
8 Reference
[1] RSA Cryptography Standard
ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf
[2] Diffie-Hellman Key Agreement Method
http://www.ietf.org/rfc/rfc2631.txt
[3] Advanced Encryption Standard (AES)
http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
[4] Handbook of Applied Cryptography
http://www.cacr.math.uwaterloo.ca/hac/
[5] Security Issues in the Diffie-Hellman Key Agreement Protocol
http://crypto.cs.mcgill.ca/~stiglic/Papers/dhfull.pdf
Comments
Client gửi PU(n,e) cho server giả sử từ đầu.
Server gửi cho client có mã hóa thông tin gồm xác thực với MAC không?
Em thấy có rất nhiều giao thức dựa trên Diffie Hellman như EKE, SPEKE.
A Lê Hồng Phong alumni, 1979