Search code examples
javahashmaphashcode

How are the objects distributed across buckets in a HashMap?


What will happen if I return -1 in the hashcode method?

what will be the bucket location for negative hashcodes? Where entry of the map will be stored for the nagative hashcode?

Why doesn't it cause an IndexOutOfBoundsException?


Solution

  • I assume that the OP understands how HashMap works and the question is just about the technical details. Very often, people explain the process of distributing values across buckets to simply take a mod of the hash to determine the index of the bucket for an object.

    This becomes problematic, when you have a negative hash:

    (hash < 0 && n > 0 ) => hash % n < 0
    

    To answer this question about the implementation detail in Java, let's jump directly to the source code:

    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
    

    The 'address' of an item is:

    tab[(n - 1) & hash]
    

    Where n is the number of buckets. This will always result in a number in range [0, n-1].

    According to the request for clarification in the comment section, let me explain how & operand work:

    Let's take n = 10 and hash = -1

           n = 00001010
        hash = 11111111
    n & hash = 00001010
    

    & is a bitwise and operator. It means that, for each bit it checks if it's on in both arguments.

    As n is always non-negative, it'll have some leading 0s in their IEEE-754 representation.

    It means, that the result of an & operation will have at least the same number of leading zeros and therefor is less or equal to n.