I am attempting to hash an object of type IEnumerable<anotherobject>, which has about 1,000 entries, and keep the hash. I'll be generating another such object later, and I'd like to detect any changes in the values of the entries by comparing the hash codes of the two objects.
Basically, I was wondering whether GetHashCode() is suitable for this, from both a performance and a reliability perspective.
If I have to override it, what would be a good way to do so? Does it always depend on the type of anotherobject and on what Equals means when comparing two anotherobject instances? Is there a generic way to do it? I ask because my object can be quite big.
The return value of GetHashCode is guaranteed to be the same for the same object only within a single execution of the application; you cannot rely on it if you store hash codes between executions. See the MSDN documentation for System.Object.GetHashCode() for more information ("a different hash code can be returned [by GetHashCode] if the application is run again."). In fact, as of March 2016, hash codes are documented to possibly differ between processes and between application domains (even within the same process); see the warning box in the GetHashCode documentation.
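The return value of GetHashCode alone should never be used to determine an object's equality. Two different objects can return the same hash code, so matching hash codes only suggest equality; calling Equals is still necessary to confirm it.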
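To illustrate the point about hash codes versus Equals, here is a minimal sketch in which the hash serves only as a quick "definitely changed" check and the element-by-element comparison makes the final call. The names SequenceComparer, CombinedHash and HasChanged are hypothetical, and the 17/31 combining scheme is just one common choice, not something the documentation prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceComparer
{
    // Combines the element hash codes in order.
    // The result is only meaningful within the current process run.
    public static int CombinedHash<T>(IEnumerable<T> items)
    {
        unchecked
        {
            int hash = 17;
            foreach (T item in items)
            {
                hash = hash * 31 + (item == null ? 0 : item.GetHashCode());
            }
            return hash;
        }
    }

    // Returns true if newItems differs from oldItems.
    public static bool HasChanged<T>(int oldHash, IReadOnlyList<T> oldItems, IReadOnlyList<T> newItems)
    {
        // Different hashes prove the sequences differ.
        if (CombinedHash(newItems) != oldHash)
        {
            return true;
        }
        // Equal hashes prove nothing (collisions), so fall back to Equals.
        return !oldItems.SequenceEqual(newItems);
    }
}
```

Note that this requires keeping the old sequence around for the fallback comparison; if only the hash is kept, a collision would go unnoticed.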
For guidance on implementing GetHashCode, see the documentation's Notes to Inheritors.
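As a rough illustration of that guidance, the sketch below overrides Equals and GetHashCode together on a hypothetical element type, Entry, standing in for anotherobject; its Id and Name properties are assumptions made for the example. The key rule it follows is that objects considered equal must return the same hash code.

```csharp
using System;

// Hypothetical element type; the real "anotherobject" will have its own fields.
public sealed class Entry : IEquatable<Entry>
{
    public int Id { get; }
    public string Name { get; }

    public Entry(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Two entries are equal when all fields that define their value match.
    public bool Equals(Entry other)
    {
        return other != null && Id == other.Id && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Entry);
    }

    // Uses exactly the fields that Equals compares, so equal entries
    // always produce equal hash codes (within one process run).
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Id;
            hash = hash * 31 + (Name == null ? 0 : Name.GetHashCode());
            return hash;
        }
    }
}
```

So the answer to "does it depend on the type" is yes in this sketch: which fields take part is decided by what Equals means for that type.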
On the default implementation of GetHashCode:
The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.
(Note that this is different from, for example, Java's default implementation of hashCode(), which is documented to try to return different values for different objects "as much as is reasonably practical".)
If you need a more stable hash function, therefore, you must use your own and, more importantly, document it so that users can rely on its stability.
There are several choices here, like MurmurHash3, MD5, and others. The important thing here is to document which hash function you're using.
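As one possible sketch of such a documented, stable hash, the code below uses SHA-256 (swapped in here as one of the "others"; MD5 or MurmurHash3 would slot in the same way) over a length-prefixed UTF-8 encoding of each element. The class name StableSequenceHash and the caller-supplied serialize function are hypothetical; the result is stable across runs only if serialize returns the same string for equal elements every time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class StableSequenceHash
{
    // Documented format: for each element, a 4-byte little-endian length
    // followed by the UTF-8 bytes of serialize(element); the whole buffer
    // is hashed with SHA-256 and returned as uppercase hex.
    public static string Compute<T>(IEnumerable<T> items, Func<T, string> serialize)
    {
        using (var sha = SHA256.Create())
        using (var buffer = new MemoryStream())
        using (var writer = new BinaryWriter(buffer, Encoding.UTF8))
        {
            foreach (T item in items)
            {
                byte[] bytes = Encoding.UTF8.GetBytes(serialize(item));
                // The length prefix keeps ["ab", "c"] distinct from ["a", "bc"].
                writer.Write(bytes.Length);
                writer.Write(bytes);
            }
            writer.Flush();
            byte[] digest = sha.ComputeHash(buffer.ToArray());
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }
}
```

A call might look like StableSequenceHash.Compute(entries, e => e.ToString()), assuming ToString() captures every value you care about. With about 1,000 entries, buffering the encoded bytes in memory should not be a concern, and the resulting string can be stored and compared between executions.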