Why do string hash codes change for each execution in .NET?


Consider the following code:

Console.WriteLine("Hello, World!".GetHashCode());

First run:

139068974

Second run:

-263623806

Now consider the same thing written in Kotlin:

println("Hello, World!".hashCode())

First run:

1498789909

Second run:

1498789909

Why do string hash codes change on every execution in .NET, but not on other runtimes like the JVM?


Solution

  • Why do hash codes for strings change for every execution in .NET?

    In short: to prevent hash collision attacks. The documentation for the <UseRandomizedStringHashAlgorithm> configuration element outlines the reason:

    The string lookup in a hash table is typically an O(1) operation. However, when a large number of collisions occur, the lookup can become an O(n²) operation. You can use the configuration element to generate a random hashing algorithm per application domain, which in turn limits the number of potential collisions, particularly when the keys from which the hash codes are calculated are based on data input by users.

    but not on other runtimes like the JVM?

    Not exactly. Other runtimes randomize too: Python's string hash, for example, is randomized per process. Conversely, C# produces the same string hash code on every run in .NET Framework, .NET Core 1.0, and .NET Core 2.0 when <UseRandomizedStringHashAlgorithm> is not enabled.

    For Java, it may be a historical issue: the algorithm behind String.hashCode() is publicly documented, and it is not a particularly good one (read this).
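
To illustrate why a fixed, publicly known string hash makes collisions easy to craft, here is a minimal C# sketch. It assumes the 31-based polynomial that the Java documentation specifies for String.hashCode(). The names PredictableHashDemo and JavaStyleHash are hypothetical and exist only for this example; this is not how .NET computes GetHashCode().

```csharp
using System;

static class PredictableHashDemo
{
    // Hypothetical helper: h = 31 * h + c over all UTF-16 code units,
    // with 32-bit wraparound, as documented for Java's String.hashCode().
    public static int JavaStyleHash(string s)
    {
        int h = 0;
        foreach (char c in s)
        {
            h = unchecked(31 * h + c);
        }
        return h;
    }

    static void Main()
    {
        // Same value on every run; matches the Kotlin output in the question.
        Console.WriteLine(JavaStyleHash("Hello, World!")); // 1498789909

        // "Aa" and "BB" collide, because 65 * 31 + 97 == 66 * 31 + 66 == 2112.
        Console.WriteLine(JavaStyleHash("Aa") == JavaStyleHash("BB")); // True

        // Concatenating colliding blocks yields more colliding keys,
        // so n blocks give 2^n strings that share a single hash code.
        foreach (string key in new[] { "AaAa", "AaBB", "BBAa", "BBBB" })
        {
            Console.WriteLine($"{key} -> {JavaStyleHash(key)}"); // each prints 2031744
        }
    }
}
```

Because the function never changes, anyone can precompute such keys offline and submit them as user input. With a per-process random seed, the attacker cannot know in advance which keys will collide.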