hash, cryptography, genetic-algorithm, evolutionary-algorithm

Do hash functions contradict the founding assumptions of Evolutionary Algorithms?


  1. Evolutionary Algorithms use a fitness function to select candidates for survival across generations ("survival of the fittest"). I believe all fitness functions assume that the closer a candidate's value is to the desired value, the closer its input ("key") must be to the desired input.

  2. Cryptographic Hash Functions have the property that "it is infeasible to generate a message that has a given hash". I understand this to mean that there is little or no correlation between the "closeness" of hash values and the "closeness" of keys.

Putting these two together, doesn't that imply that the "survival of the fittest" assumption is wrong for Cryptographic Hash Functions? Meaning, if you wanted to use Evolutionary Algorithms to try to figure out the reverse of a Cryptographic Hash value, the fitness function would drive you in the wrong direction. Is the correlation between "closeness" of values and "closeness" of keys a prerequisite of Evolutionary Algorithms?


Solution

  • Yes, it's pretty much impossible to construct a fitness function that consistently tells you whether input A is closer to the goal than input B when all it has to go on is the output of a (good) cryptographic hash function for A, B, and the goal. That follows from the property you mentioned. So evolutionary algorithms can't speed up reversing cryptographic hash functions in the average case. However, this shouldn't be a surprise: said property is only useful in the first place because it breaks precisely the approach an evolutionary algorithm would take (speeding up reversal by looking at hash value similarity).

    Generalizing this, evolutionary algorithms (like all other algorithms that rely on a heuristic to guide their search, e.g. A*) are only useful if you can define a meaningful fitness function (heuristic). Obviously, it is possible to construct problems that don't allow this (e.g. by giving too little information), and it's quite probable that some real-world applications have the same problem. Evolutionary algorithms don't cure cancer, but again that's no surprise (nothing does, else we'd have moved to a different metaphor).

    On a side note, the fitness function doesn't have to measure closeness to any particular value. There are many problems where the fitness can grow indefinitely, e.g. when optimizing code for performance, the fitness could be the number of operations per second.
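
To make the first point concrete, here is a minimal sketch in Python of the naive fitness function an evolutionary algorithm might try when reversing a hash: count how many bits of a candidate's digest match the target digest. The choice of SHA-256, the helper names (`digest`, `hamming_distance`, `fitness`) and the sample inputs are all assumptions made for illustration, not anything from the question.

```python
import hashlib


def digest(message: bytes) -> bytes:
    """SHA-256 digest of a message (32 bytes, i.e. 256 bits)."""
    return hashlib.sha256(message).digest()


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def fitness(candidate: bytes, target_digest: bytes) -> int:
    """Naive fitness: how many of the 256 digest bits match the target."""
    return 256 - hamming_distance(digest(candidate), target_digest)


# Hypothetical example values: the input we pretend not to know, one
# candidate that is almost identical to it, and one unrelated candidate.
secret = b"evolutionary"
target = digest(secret)

near = b"evolutionarz"  # differs from the secret in the last character only
far = b"xqzjwkvbnmpl"   # unrelated string of the same length

print("exact match:", fitness(secret, target))
print("near miss:  ", fitness(near, target))
print("unrelated:  ", fitness(far, target))
```

With a good hash function you would expect the near miss and the unrelated string to both score somewhere around 128 of 256 matching bits, i.e. what a coin flip per bit would give. The score therefore carries no usable signal about which candidate to keep and mutate, and selection degenerates into random search.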