Search code examples
hash-collision

Chance of a hash collision


Apologies if this is a duplicate question; most of those I've found are over my head, so I may have missed the answer.

For a given hash, say MD5 (128 bits), what is the chance of a hash collision with 10^12 of them?

My maths is not great, I've come up with this equation (I think it's correct) but have no idea how to solve it:

Collision_Chance = 1 - (1 - (1 / 2^128) ) ^ (10^12)

I'm guessing it's somewhere around 10^-26, does this sound about right?

Thanks

Edit: I think my estimate is very wrong. See Birthday Paradox


Solution

  • What does your formula say for having 2^128 + 1 values? I believe it does not say that the collision probability is 1, so it cannot be right. actually, I know it is not – the correct formula is rather large and unwieldy, but there are good approximations using the exponential of a fraction. SO does not typeset formulas, so I won’t try and write the formulas down here.

    The best key word to search for is probably “birthday attack”.