Search code examples
javascriptreverse-engineeringbrute-forcehash-function

Given this hash function, an expected output, and the length of the input string, how do I find the input string that returns the given result?


I have this hash function below.

I know that for an input string of length 8 I get a hash with the value of 16530092119764772

The input string can only consist of the characters "abcdefghijklmnop"

What is the best approach to find the input string?

Is there a way to break down the problem mathematically without relying on a brute-force approach to find the string?

Would a recursive solution overflow the stack?

function hash(str) {

  let g = 8;
  let charset = "abcdefghijklmnop";

  for(let i = 0; i < str.length; i++) {
    g = (g * 82 + charset.indexOf(str[i]));
  }

  return g;

}

As an example for the string "agile" it hashes to 29662550362


Solution

  • That’s not even really a hash, because charset doesn’t have 82 characters in it. It’s more like parsing a string as a base-82 number where you can only use the first 16 symbols. It’d be completely reversible if it didn’t use floating-point numbers, which are imprecise for integers that big. In case you’re not familiar with why, the simplified version is that the operation inside the loop:

    g * 82 + d
    

    gives a different result for every possible value of g and d as long as d is less than 82, because there’s enough space between g * 82 and (g + 1) * 82 to fit 82 different ds (from 0 to 81). Each different result is reversible back to g and d by dividing by 82; the whole value is g and the remainder is d. When every operation inside the loop is reversible, you can reverse the whole thing.

    So, like you might convert a number to decimal manually with a loop that divides out one digit at a time, you can convert this imprecise number into base 82:

    const getDigits = (value, base) => {
        const result = [];
      
        while (value) {
            result.push(value % base);
            value /= base;
        }
      
        return result.reverse();
    };
    
    const getLetter = index =>
        String.fromCharCode(97 + index);
    
    const getPreimage = value =>
        getDigits(value, 82n)
            .map(Number)
            .map(getLetter)
            .join('');
    
    console.log(getPreimage(29662550362n));
    console.log(getPreimage(16530092119764772n));

    The results start with “i” because g starts at 8 instead of 0. The second number is also big enough to not be unique (in contrast to agile’s “hash”, which can be represented exactly by a JavaScript number), but if you were just trying to find any preimage, it’s good enough.

    function hash(str) {
    
      let g = 8;
      let charset = "abcdefghijklmnop";
    
      for(let i = 0; i < str.length; i++) {
        g = (g * 82 + charset.indexOf(str[i]));
      }
    
      return g;
    
    }
    
    for (const s of ['hijackec', 'hijacked', 'hijackee', 'hijackef', 'hijackeg']) {
        console.log(s, hash(s) === 16530092119764772);
    }