I was just messing around with some code on python and realized it was not that difficult to the find out what a password is if you have the md5 (basically a brute force attack, imputing md5, going through millions of passwords, converting them to md5 and checking if it matches, and then outputting the password) but what is difficult is getting the md5. I did some digging and all i found was some videos of people using randomly generated md5 hashes of passwords and then finding out what password it corresponded to. What i was wondering was if there was any way you could find the md5 hash of a password without having the original password. Thx
-If anything is unclear just tell me in the comments and i will clean it up
You're correct that you can brute-force an md5 hash to retrieve the original password, provided that the original password and the brute-force attempts hash to the same value. To compensate for this, often password systems use what's known as "salt" to make this significantly more difficult. (See also: What is SALT and how do i use it?)
The answer to your question, in general, is no, there is no easy way to obtain the hash of some value without having that value first.
Originally, hashing algorithms were designed to take some input and manipulate it so the output of the algorithm can be used as an index into a table of values. The goal is to have a 1:1 hash (ideally that's extremely fast, hopefully constant time). This means that given some input value x
, y = hash(x)
should be such that ONLY x
hashes to y
. In other words, y1 = hash(x1) = hash(x)
if, and only if, x1 = x
.
As time went on, algorithms were developed that had other properties. Since it became common for hashing algorithms to be used to things like password storage and quick comparison, one of the things that was valued with a hashing algorithm is how small changes to the input should lead to differences in the output. In other words, the hashing function hash(x)
should change if x
changes entirely, (as in the case of not(x)
), or if it changes by a single bit.
One corollary is that, if hash(x)
changes significantly when you change a single bit (as in the case of hash(x+0x000001)
, then it makes the comparison function much faster (since you really only need to check the higher order bits to determine if two objects are the same, in the average case). This means that you can't easily compute the hash of sequential items simply by iterating through hashes (i.e. "guessing" the hash output of the function hash(x)
without first having x
).