Search code examples
javadictionaryhashcodecollision

Java Map::hashCode() collision - why?


The following code results in the same hash code being generated for the two maps, any ideas?


import java.util.HashMap;
import java.util.Map;

public class Foo
{
    @SuppressWarnings("unchecked")
    public static void main (String[] args)
    {
        Map map;

        map = new HashMap();

        map.put("campaignId", 4770L);
        map.put("location", "MINI_PROFILE");
        map.put("active", "true");
        map.put("lazy", true);

        System.out.println(map.hashCode());

        map = new HashMap();

        map.put("campaignId", 4936L);
        map.put("location", "MINI_PROFILE");
        map.put("active", "true");
        map.put("lazy", false);

        System.out.println(map.hashCode());


    }
}

The result is:

-1376467648
-1376467648

Simply changing the key names is enough to make the code generate two different hash codes.


Solution

  • Simply coincidence, I suspect... there are bound to be collisions, and in this case it looks like the relevant different bits in the first value are being lost, effectively.

    However, it shouldn't make any difference - anything using hash codes must cope with collisions.

    EDIT: It's just the way the hashes happen to be calculated. This code shows what's going on:

    import java.util.*;
    
    public class Test
    {
        @SuppressWarnings("unchecked")
        public static void main (String[] args)
        {
            AbstractMap.SimpleEntry[] entries = {
                new AbstractMap.SimpleEntry("campaignId", 4770L),
                new AbstractMap.SimpleEntry("campaignId", 4936L),
                new AbstractMap.SimpleEntry("lazy", true),
                new AbstractMap.SimpleEntry("lazy", false)
            };
            for (AbstractMap.SimpleEntry entry : entries) {
                System.out.println(entry + ": " + entry.hashCode());
            }
        }
    }
    

    Results:

    campaignId=4770: -1318251287
    campaignId=4936: -1318251261
    lazy=true: 3315643
    lazy=false: 3315617
    

    So in one pair the first map has a hash 26 less than the second map, and in another pair the first map has a hash 26 more than the second map.

    AbstractMap just sums hash values (one way of making sure that ordering is irrelevant) so the two end up with the same hash code.

    It's really down to Boolean.hashCode() which looks like this:

    return value ? 1231 : 1237;
    

    ... and Long.hashCode() which looks like this:

    return (int)(value ^ (value >>> 32));
    

    Given the values they happened to pick in Boolean.hashCode(), if your long values are only 26 apart (or 26 * 2^32 apart) then you'll run into the same thing.