For general hash table implementation:
Compute the hash of the key,
hash(key)=hashcode
Map the hashcode to the table/array.
hashcode % array_length = index
once we get the index, we add a node (the key, value, update next pointer) in the Linked List at that index.
So, the question is, what's the difference between:
def _get_index(self, key):
# compute the hashcode
hash_code = hash(key)
array_index = hash_code & 15 # FIXME : why?
return array_index
and
array_index = hash_code % 15
For example: for INPUT:
hm =MyHashMap()
hm.put("1", "sachin")
hm.put("2", "sehwag")
hm.put("3", "ganguly")
print(hm.get("1"))
print(hm.get("2"))
print(hm.get("3"))
OUTPUT:
sachin
sehwag
ganguly
'&' operator instead of '%' that doesn't make sense to me? because it doesn't work always as % operator in computing the index, But, I have seenn developer using &
in some implementations of Hashtable
Any suggestion?
array_index = hash_code & 15
is equivalent to (on positive values):
array_index = hash_code % 16
it only works in the case when all significant bits of the number are ones (which is when the number is of the form 2**n - 1
).
Both remove the highest part of the number bits.
Bit-masking is way faster than division. So it's used when possible to speed up computations. Every time you see:
b = a % modulo
with a > 0
and modulo
is a power of 2 (modulo == 2**n
), you can write:
b = a & (modulo-1)
instead. If the modulo is not a power of 2, then it cannot be done that way (and compiled languages optimizers often substitute power of 2 modulos or divisions/multiplications by the faster bit masking/shifting operation)
Even if it's true that bit-masking is way faster than division/modulus in assembly language, python is interpreted and that speed optimization is not really noticeable. Anyway if the intent is to mask bits, &
operator makes more sense.