
FNV 'flavors' and PHP implementation


I'm trying to integrate the FNV hashing algorithm into a PHP-based project as part of a requirement to produce hashes for a variety of data (e.g. URLs, keywords).

I saw this implementation by Neven Boyanov. He mentioned that, due to arithmetic limitations in PHP, he was forced to use bit shifts and addition instead of multiplication. Is his implementation correct? My knowledge of this area of computer science is somewhat limited, so I can't verify it myself.

Another question I have is about the different 'flavors' of FNV. I saw that it comes in 32-bit, 64-bit, and 128-bit variants, but using the above implementation I always get 8-character hex hashes (I convert the integer result to hex using dechex()).

Given the input "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Proin at libero mi, quis luctus massa.", I get the following hex results:

  • (32-bit offset) 5b15c0f2
  • (64-bit offset) 6ea33cb5

Why is this so? I'm expecting a 16-character hex result from the 64-bit FNV. Are the 'flavors' referring only to the kind of arithmetic operations and seeds that would be used and not to the length of the result? (i.e. if I say 64-bit FNV, the hashing function would use 64-bit operations and seed but the result would still be 32-bit)

A bit of enlightenment would be greatly appreciated :)


Solution

  • I wrote the PHP FNV hash function long ago for a particular purpose, and at that time the 32-bit implementation was sufficient.

    To answer your first question - the implementation was tested against other (C and C++) implementations by comparing the algorithm (code) and sample results. So for 32-bit results it works as it should.

    If you want to implement the 64-bit (or 128-bit) version yourself, you should first change the FNV_offset_basis, and also the expression on line 73, which currently is:

    $hash += ($hash<<1) + ($hash<<4) + ($hash<<7) + ($hash<<8) + ($hash<<24);
    

    ... this is equivalent to multiplying by 16777619 (FNV_prime_32), which in binary is 1000000000000000110010011 and breaks down into the expression 2^24 + 2^8 + 2^7 + 2^4 + 2^1 + 2^0.
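    As a quick check of that equivalence, the following sketch compares the shift-and-add step with a plain multiplication, both reduced to 32 bits. The function names are hypothetical and not part of the original implementation, and the sketch assumes PHP 7 or later on a 64-bit build, where the intermediate values fit in a native integer.

```php
<?php
// Hypothetical helpers, not part of the original implementation.
// Assumes 64-bit PHP: for inputs below 2^32 every intermediate sum stays below 2^63.

// Multiply by FNV_prime_32 using shifts and additions only.
function fnv_step_shift(int $hash): int
{
    $hash += ($hash << 1) + ($hash << 4) + ($hash << 7) + ($hash << 8) + ($hash << 24);
    return $hash & 0xffffffff; // keep the low 32 bits
}

// Multiply by FNV_prime_32 (16777619) directly.
function fnv_step_multiply(int $hash): int
{
    return ($hash * 16777619) & 0xffffffff; // keep the low 32 bits
}

// 0x811c9dc5 is the 32-bit FNV offset basis.
var_dump(fnv_step_shift(0x811c9dc5) === fnv_step_multiply(0x811c9dc5)); // bool(true)
```

    On a 32-bit build the multiplication would overflow into a float, which is presumably the limitation that motivated the shift-based form.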

    For 64-bit you should multiply by 1099511628211, which in binary is 10000000000000000000000000000000110110011 and breaks down into the expression 2^40 + 2^8 + 2^7 + 2^5 + 2^4 + 2^1 + 2^0.

    I don't know how PHP will handle the expression $hash << 40, so you should experiment yourself. On my PHP 5.2.x, shifts by more than 31 bits did not work well.
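    One way to sidestep large shifts and integer overflow is to keep the 64-bit hash as two 32-bit halves. The sketch below is only an illustration under stated assumptions: it implements the FNV-1 variant (multiply, then XOR), requires a 64-bit build of PHP 7 or later, and uses a hypothetical function name. It relies on the 64-bit prime 1099511628211 being equal to 2^40 + 0x1b3.

```php
<?php
// Hypothetical sketch of 64-bit FNV-1; assumes 64-bit PHP 7+.
// The hash is kept as two 32-bit halves so no intermediate value overflows.
function fnv1_64_halves(string $data): string
{
    // 64-bit FNV offset basis 0xcbf29ce484222325, split into halves.
    $hi = 0xcbf29ce4;
    $lo = 0x84222325;

    for ($i = 0, $n = strlen($data); $i < $n; $i++) {
        // Multiply by 2^40 + 0x1b3 modulo 2^64.
        // The 2^40 term only affects the high half: it adds ($lo << 8) there.
        $lowProduct = $lo * 0x1b3; // below 2^41, so it cannot overflow
        $hi = ($hi * 0x1b3 + ($lowProduct >> 32) + ($lo << 8)) & 0xffffffff;
        $lo = $lowProduct & 0xffffffff;

        // FNV-1: XOR the next byte into the low bits after multiplying.
        $lo ^= ord($data[$i]);
    }

    return sprintf('%08x%08x', $hi, $lo);
}

echo fnv1_64_halves(''), "\n"; // cbf29ce484222325 (the offset basis itself)
```

    The result is a 16-character hex string, which can be compared against PHP's built-in hash('fnv164', ...) where that is available.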

    Finally, you may need to modify the line $hash = $hash & 0x0ffffffff;, which removes some garbage from the result. I figured that out through experiments. For 64-bit it should be something like $hash = $hash & 0x0ffffffffffffffff;. Verify that it works correctly with PHP.
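    One thing worth checking when trying the 64-bit mask: PHP integers are signed, so on a 64-bit build a hex literal with sixteen f digits is larger than PHP_INT_MAX and is parsed as a float rather than an integer. The sketch below, which assumes a 64-bit build, shows the difference; the variable names are only illustrative.

```php
<?php
// Assumes a 64-bit PHP build (PHP_INT_SIZE === 8).
$mask32 = 0xffffffff;         // 4294967295, fits in a native integer
$mask64 = 0xffffffffffffffff; // larger than PHP_INT_MAX, parsed as a float

var_dump(is_int($mask32));   // bool(true)
var_dump(is_float($mask64)); // bool(true)

// An integer with all 64 bits set is -1, so masking with it changes nothing.
var_dump((0x1234 & -1) === 0x1234); // bool(true)
```

    In other words, a native 64-bit integer is already limited to 64 bits, and the thing to watch for is an addition or multiplication silently turning the value into a float.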

    You can also use other PHP libraries for higher arithmetic precision. In my opinion using bitwise shifts is faster.
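    As an illustration of the library route, here is a minimal sketch of 64-bit FNV-1 using the GMP extension. It assumes GMP is installed and PHP 7 or later; the function name is hypothetical, and the constants are the standard 64-bit FNV offset basis and prime.

```php
<?php
// Hypothetical sketch of 64-bit FNV-1 using the GMP extension (assumed to be installed).
function fnv1_64_gmp(string $data): string
{
    $hash  = gmp_init('cbf29ce484222325', 16); // 64-bit FNV offset basis
    $prime = gmp_init('1099511628211', 10);    // 64-bit FNV prime
    $mask  = gmp_init('ffffffffffffffff', 16); // keeps the low 64 bits

    for ($i = 0, $n = strlen($data); $i < $n; $i++) {
        $hash = gmp_and(gmp_mul($hash, $prime), $mask); // multiply, then truncate
        $hash = gmp_xor($hash, ord($data[$i]));         // FNV-1: XOR the byte last
    }

    // Left-pad so the result is always 16 hex characters.
    return str_pad(gmp_strval($hash, 16), 16, '0', STR_PAD_LEFT);
}
```

    Arbitrary-precision arithmetic keeps the code close to the textbook definition, at the cost of an extension dependency and, most likely, some speed.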

    In fact, you can produce an FNV hash for any number of bits.
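    For comparison, PHP 5.4 and later include FNV-1 and FNV-1a in the built-in hash() function under the algorithm names fnv132, fnv1a32, fnv164 and fnv1a64. The short sketch below suggests that the 'flavor' does determine the length of the result; the input string is arbitrary.

```php
<?php
$input = 'Lorem ipsum dolor sit amet';

// 32-bit variants yield 8 hex characters, 64-bit variants yield 16.
foreach (['fnv132', 'fnv1a32', 'fnv164', 'fnv1a64'] as $algo) {
    $hex = hash($algo, $input);
    printf("%-8s %2d chars  %s\n", $algo, strlen($hex), $hex);
}
```

    Getting 8 characters from a 64-bit offset basis therefore points to the result being truncated to 32 bits somewhere in the computation, such as by a 32-bit mask.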