Search code examples
phpsecuritycryptographymd5black-box

Find $secret in $a == md5($b . $secret)


A function is:

$a == md5($b . $secret);
  • You can choose $a and $b
  • You don't know $secret
  • You get the value of the function for the $a and $b you choose as either true or false.

Is there any better attack than brute force to find the $secret in general? Is there any better attack than brute force to find the $secret using PHP's md5 function?

From what I've found on the web I think there is not, though md5 is deprecated for some other use cases. So just to be sure ...

Kind regards


Solution

  • If MD5 behaves like a random oracle (that's a big "if", see below), then exhaustive search on $secret is the best possible attack -- and, more importantly, each "guess" on the value of $secret MUST use a query to the function (since you use PHP, I assume that the function is implemented in a Web server, and each "query" requires talking to that server). The latter is implied by the lack of information sent back to the attacker: the attacker does not get anything except a single bit (the "True" or "False" result). In particular, he does not get the MD5 output itself. The attacker will get a long stream of uninformative "False" results unless he hits the correct MD5 output, either out of pure chance (probability 2-128, which is Really Darn Small), or because he properly guessed the value of $secret beforehand. It is worth noticing that this prevents the attacker from using many cost-sharing techniques, including precomputed tables, in particular the over-hyped rainbow tables.

    A random oracle is a mythical object which can be viewed as a deterministic black box: you know nothing on the output you will get from a given input, except that the box will always return the same result for a given input. A model is the following: the box contains a gnome, some dice and a big book. The gnome uses the dice to randomly choose the output. He also uses the book to keep track of the answers he already sent, so as to be consistent, i.e. if the input is identical to a previously submitted input, the gnome will return the same output than previously instead of throwing dice.

    MD5, however, is not a random oracle. For instance, we can build collisions for MD5 much faster than the theoretical 264 resistance for a function with a 128-bit output. Also, note that being a good hash function (collision-resistant and so on) does not absolutely require "random-oracleness". For instance, SHA-256 is considered to be a secure hash function, while it still suffers from the so-called "length extension attack" (given SHA256($a), one can compute SHA256($a . $b) without knowing $a, for almost arbitrary values of $b). So the guarantees of a random oracle do not hold with MD5 (or, for that matter, SHA-256). This does not mean that faster attacks are known ! Only that you are on your own here.

    One can also point out that md5($b . $secret) is a kind of "keyed hash", i.e. a MAC (Message Authentication Code). Building a MAC out of a hash function is not easy, precisely because of things like the length extension attack (md5($secret . $b), for instance, would be a very poor MAC). A robust way of building a MAC out of a hash function has been designed; it is called HMAC and involves two invocations of the underlying hash function (but one of them is on a short input, so this is efficient nonetheless). Security of HMAC, more precisely how HMAC can be considered to behave as a random oracle, can be "proven", i.e. reduced to some hash function internal properties which are believed true in the case of SHA-256 (see New Proofs for NMAC and HMAC: Security without Collision-Resistance by Mihir Bellare for the gory details). By using HMAC/SHA-256 over $b, with $secret as key, you would benefit from these security results, and your construction would be more convincingly robust. Then again, I do not claim that there is a known attack on md5($b . $secret), only that both the use of MD5 and the homemade MAC construction raise red flags, which degrade the level of trust which can be imparted to such a system.