Search code examples
javacollectionshashmap

Confusion around Java HashMap bucketing logic


I was going through the putVal method implementation in the Java HashMap class. Below is the code for reference.

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

See this condition in above code

if ((p = tab[i = (n - 1) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);

Why is there i = (n - 1) & hash in if condition? Shouldn't it be something like i = hash % n since it would give you position in array? What I understand is that you calculate hash, then take mod with size of array to calculate desired position. Am I missing something?


Solution

  • One might think that the two are identical, since n is guaranteed to be a power of 2.

    Example: Say n is 8. 8 in binary is 0b1000. To calculate the remainder of hash modulo 8, all we need to do is take the last three digits of hash, which is hash & 0b111. 0b111 is 7, aka n-1.

    BUT: hash % n can be negative, if hash is negative! Which is certainly not what we want, if we want to use the result as an array index. So the reason why they didn't use your proposal is that it doesn't work. (See e.g. this question)