Update: comments on Twitter.
Most users of Curve25519 believe that they don't have to validate public keys. I used to believe that too until I saw what could go wrong.
Recently I reviewed a protocol which could be simplified as follows
1. Alice and Bob perform the classic unauthenticated Diffie-Hellman dance. They obtain some shared keys at the end of this step.
2. Alice uses the shared keys to encrypt her data, and the resulting ciphertext is broadcast over radio (Bluetooth, WIFI, etc.) Since only Bob has the shared keys, nobody else could see the data.
This is more or less the same as SSL, except that the key exchange is unauthenticated. The designers have good reasons to believe that leaving the protocol unauthenticated poses a low risk only. They do their homework, and decide to use Curve25519, a state of the art Diffie-Hellman function. It turns out that this choice weakens the protocol.
If Mallory replaces Alice's and Bob's public keys with zero, which is a valid Curve25519 public key, he would be able to force the ECDH shared value to be zero, which is the encoding of the point at infinity, and thus get to dictate some publicly known values as the shared keys. It still requires an active man-in-the-middle attack to pull the trick, after which, however, not only Mallory can decode Alice's data, but everyone too! It is also impossible for Alice and Bob to detect the intrusion, as they still share the same keys, and can communicate with each other as normal.
If you haven't figured out what is going on yourself, the problem is that (0, 0) is a valid point on Curve25519, and has order of 2, i.e., (0, 0) + (0, 0) = the point at infinity. Curve25519 private keys are always even, thus the shared point would always be the point at infinity, whose x-coordinate is encoded as zero. If none of this makes any sense, I recommend buying a copy of An Introduction to Mathematical Cryptography.
Run this Sage script to see it yourself:
# the prime
P = 2^255 - 19
# the finite field over P
F = GF(P)
# Curve25519, see the paper
A = 486662
E = EllipticCurve(F, [0, A, 0, 1, 0])
# the point at infinity
INFINITY = (0, 1, 0)
# point (0, 0)
ZERO = E(0, 0)
# (0, 0) + (0, 0)
Q = ZERO + ZERO
# should output true
Q == INFINITY
OK, this sounds bad. How do I validate Curve25519 public keys? This is where the confusion begins. Let me first quote Curve25519 official answer:
How do I validate Curve25519 public keys?
Don't. The Curve25519 function was carefully designed to allow all 32-byte strings as Diffie-Hellman public keys [...]
There are some unusual non-Diffie-Hellman elliptic-curve protocols that need to ensure ``contributory'' behavior. In those protocols, you should reject the 32-byte strings that, in little-endian form, represent 0, 1, [...]
What is the contributory behavior? Should we care? It seems no, as long as we're using ECDH, but it recommends rejecting zero, which sounds like what we want to do to prevent the attack.
The answer is yes, you have to validate public keys if you are using a protocol that requires contributory behavior. This behavior, which sounds like a half-brother of the confused deputy, requires that none of the parties participating in the key exchange could dictate the shared value. But that is allowed in our example protocol! Exactly, now you see the root cause of the vulnerability. A better way to state the requirement is: you have to validate public keys if you are using a protocol that is vulnerable to key-dictating attacks.
You might think that I just made up a protocol to prove my point, but protocols requiring contributory behavior are not uncommon. For example, all versions of TLS <= 1.2 require this property, and as a result, are vulnerable to the triple handshake attack which uses a clever trick to dictate the shared value. I suspect that many homegrown protocols would be vulnerable too.
Whose fault is it? Many people, including me, consider Curve25519 the best practice when it comes to Diffie-Hellman. The problem is that, as a friend once said, in crypto there are also cases where the best practice is wrong, and the reasons are often subtle. This is one such case. I wouldn't blame the designers of Curve25519. Curve25519 was designed to do one job which is to compute the Diffie-Hellman function. It does its job well, but, like many powerful tools, it's, however, also a double-edged sword that could cause great harm when misused. Having said that, the attack would not work against NIST curves, but do not consider this as an endorsement of said curves -- they are inferior to Curve25519 in most aspects.
What is the solution?
Curve25519 suggests blacklisting the known bad public keys, but a better solution is to check the shared value and to raise exception if it is zero. You can also bind the exchanged public keys to the shared keys, i.e., instead of using H(abG) as the shared keys, you should use H(aG || bG || abG). If you exchange more than public keys -- TLS exchanges nonces, certs, and other info -- you should include them too. If you are implementing Curve25519 yourself, instead of encoding the point at infinity as zero, perhaps you should raise an exception. I cannot come up with any protocols which need to encode the point at infinity.