c# java performance hash theory

How do I determine if it's appropriate to cache a hashCode() result?


Given an immutable class for which I have written a GetHashCode() method, how do I know whether caching the hash result would be beneficial? In most cases, is it even wise to do so?

Given that GetHashCode() is already optimised for primitives and string values, is this even something I should bother with?

A typical GetHashCode() of mine might look like the following:

//C#
public override int GetHashCode() {
    // unchecked: let the arithmetic wrap on overflow, even in a checked build
    unchecked {
        int hash = 13;
        hash = 13 * hash + IntValue;
        hash = 13 * hash + (StringValue1 == null ? 0 : StringValue1.GetHashCode());
        hash = 13 * hash + (StringValue2 == null ? 0 : StringValue2.GetHashCode());
        return hash;
    }
}

My thoughts on when caching might be wise:

  1. If it is intended to be the key of a map or dictionary.
  2. If that map will have many lookups within its lifetime.

Solution

  • Your point "1" merely describes when you should implement GetHashCode() (and a matching Equals) at all, and in such scenarios you should expect it to be queried a moderate number of times (your point "2"). The real key is profiling, or prior knowledge of the scenario. For example, if your hash is computed over a large-ish inner array, it is probably worth caching. In such cases I would cache it lazily (perhaps in a nullable int field, int?), unless I know the object will always be used as a key, in which case I might pre-calculate it eagerly.

    In most cases, though, just calculate it on-demand each time.
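
    As a rough illustration of the lazy caching described above, here is a minimal sketch. The Fingerprint class, its _values array and the ComputeCount property are hypothetical names invented for this example, and ComputeCount exists only to make the caching observable. The sketch assumes the instance is not shared across threads. If it is, computing the hash eagerly in the constructor or wrapping it in Lazy<int> may be the safer choice.

```csharp
using System;
using System.Linq;

// Hypothetical immutable type whose hash covers a large inner array.
public sealed class Fingerprint : IEquatable<Fingerprint>
{
    private readonly int[] _values; // never mutated after construction
    private int? _cachedHash;       // null until the hash is first requested

    // Counts full computations; present only to make the caching observable.
    public int ComputeCount { get; private set; }

    public Fingerprint(int[] values)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));
        _values = (int[])values.Clone(); // defensive copy keeps the type immutable
    }

    public bool Equals(Fingerprint other) =>
        other != null && _values.SequenceEqual(other._values);

    public override bool Equals(object obj) => Equals(obj as Fingerprint);

    public override int GetHashCode()
    {
        // Lazy caching: pay the O(n) cost once, on first use.
        if (_cachedHash == null)
        {
            _cachedHash = ComputeHash();
        }
        return _cachedHash.Value;
    }

    private int ComputeHash()
    {
        ComputeCount++;
        unchecked // let the arithmetic wrap instead of throwing in a checked build
        {
            int hash = 13;
            foreach (int v in _values)
            {
                hash = 13 * hash + v;
            }
            return hash;
        }
    }
}
```

    Calling GetHashCode() twice on the same instance runs ComputeHash() only once, so ComputeCount stays at 1. Whether the extra field is worth it still depends on what profiling shows for your own workload.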