Search code examples
javastringhashcodehash-collision

Can hashcodes of short string be same?


I have short Strings (less than 10 characters). I will convert it into int and use it as primary key. (I can't use String primary key because of small problems.) I know the hashcodes of Strings of infinite length can collide, but can short Strings collide too?


Solution

  • Absolutely yes. For example, Ea and FB are colliding strings, each only two characters in length! Example:

    public static final void main(String[] args) {
        System.out.println("Ea".hashCode() + " " + "FB".hashCode());
    }
    

    Prints 2236 2236.


    The Java String#hashCode function isn't really even close to random. It's really easy to generate collisions for short strings, and it doesn't fare much better on long strings.

    In general, even if you stuck to just 6 bits per character (ASCII letters and numbers, and a couple of symbols), you'd exceed the possible values of the 32-bit hash code with only a 6-character string -- that is, you would absolutely guarantee collisions among the 2^36 6-character 6-bit strings.