Search code examples
javaalgorithmcollectionshashcode

How do I prove that Object.hashCode() can produce same hash code for two different objects in Java?


Had a discussion with an interviewer regarding internal implementation of Java Hashmaps and how it would behave if we override equals() but not the HashCode() method for an Employee<Emp_ID, Emp_Name> object.

I was told that hashCode for two different objects would never be the same for the default object.hashCode() implementation, unless we overrode the hashCode() ourselves.

From what I remembered, I told him that Java Hashcode contracts says that two different objects "may" have the same hashcode() not that it "must".

According to my interviewer, the default object.hashcode() never returns the same hashcode() for two different objects, Is this true?

Is it even remotely possible to write a code that demonstrates this. From what I understand, Object.hashcode() can produce 2^30 unique values, how does one produce a collision, with such low possibility of collision to demonstrate that two different objects can get the same hashcode() with the Object classes method.

Or is he right, with the default Object.HashCode() implementation, we will never have a collision i.e two different objects can never have the same HashCode. If so, why do so many java manuals don't explicitly say so.

How can I write some code to demonstrate this? Because on demonstrating this, I can also prove that a bucket in a hashmap can contain different HashCodes(I tried to show him the debugger where the hashMap was expanded but he told me that this is just logical Implementation and not the internal algo?)


Solution

  • 2^30 unique values sounds like a lot but the birthday problem means we don't need many objects to get a collision.

    The following program works for me in about a second and gives a collision between objects 196 and 121949. I suspect it will heavily depend on your system configuration, compiler version etc.

    As you can see from the implementation of the Hashable class, every one is guarenteed to be unique and yet there are still collisions.

    class HashCollider
    {
        static class Hashable
        {
            private static int curr_id = 0;
            public  final  int id;
    
            Hashable()
            {
                id = curr_id++;
            }
        }
    
        public static void main(String[] args)
        {
            final int NUM_OBJS = 200000; // birthday problem suggests
                                         // this will be plenty
    
            Hashable objs[] = new Hashable[NUM_OBJS];  
            for (int i = 0; i < NUM_OBJS; ++i) objs[i] = new Hashable();
    
            for (int i = 0; i < NUM_OBJS; ++i)
            {
                for (int j = i + 1; j < NUM_OBJS; ++j)
                {
                    if (objs[i].hashCode() == objs[j].hashCode())
                    {
                        System.out.println("Objects with IDs " + objs[i].id
                                         + " and " + objs[j].id + " collided.");
                        System.exit(0);
                    }
                }
            }
    
            System.out.println("No collision");
        }
    }