Search code examples
javahashmap

Does a java hashmap bucket really contain a list?


Ive always been certain that a 'bucket' in a java hash map contains either a linked list or a Tree of some kind, indeed you can read in many places on the web how the bucket holds this list then iterates over the entries using the equals function to find entries that are stored in the same bucket (ie have the same key), bearing this in mind, can someone explain why the following, trivial code doesnt work as expected :-

private class MyString {
    String internalString;
        MyString(String string) {
            internalString = string;
        }
        @Override
        public int hashCode() {
            return internalString.length();  // rubbish hashcode but perfectly legal
        }
    }
...
Map<MyString, String> map = new HashMap<>();
        map.put(new MyString("key1"), "val1");
        map.put(new MyString("key2"), "val2"); 

        String retVal = map.get(new MyString("key1"));

        System.out.println("Val returned = "+retVal);

In this example I would have expected the two map entries to be in the list in the (same) bucket and for retVal to equal "val1", however it equals null? A quick debug shows why, the bucket does not contain a list at all just a single entry.....

I thought i was going mad until I read this on the baeldung website (https://www.baeldung.com/java-map-duplicate-keys) ...However, none of the existing Java core Map implementations allow a Map to handle multiple values for a single key.

What is going on, does a bucket in a hash map contain a list or not ?


Solution

  • Does a java hashmap bucket really contain a list?

    It depends.

    For older implementations (Java 7 and earlier), yes it really does contain list. (It is a singly linked list of an internal Node type.)

    For newer implementations (Java 8 and later), it can contain either a list or a binary tree, depending on how many entries hash to the particular bucket. If the number is small, a singly linked list is used. If the number is larger than a hard-coded threshold (8 in Java 8), then the HashMap converts the list to a balanced binary tree ... so that bucket searches are O(logN) instead of O(N). This mitigates the effects of a hash code function that generates a lot of collisions (or one where this can be made to occur by choosing keys in a particular way.)

    If you want to learn more about how HashMap works, read the source code. (It is well commented, and the comments explain the rationale as well as the nitty-gritty how does it work stuff. It is worse the time ... if you are interested in this kind of thing.)


    However, none of the existing Java core Map implementations allow a Map to handle multiple values for a single key.

    That is something else entirely. That is about multiple values for a key rather than multiple keys in a bucket.

    The article is correct. And this doesn't contradict my "a bucket contains a list or tree" statement.

    Put simply, a HashMap bucket can contain multiple key / value pairs, where the keys are all different.

    The only point on which I would fault the quoted text is that it seems to imply that it is implementations of Map that have the one value per key restriction. In reality, it is the Map API itself that imposes this restriction ... unless you use (say) a List as the map's value type.