Search code examples
hashf#equalitygethashcode

Using F#'s hash function inside GetHashCode() evil?


I encountered a couple of places online where code looked something like this:

[<CustomEquality;NoComparison>]
type Test =
    | Foo
    | Bar
    override x.Equals y = 
        match y with
        | :? Test as y' ->
            match y' with
            | Foo -> false
            | Bar -> true    // silly, I know, but not the question here
        | _ -> failwith "error"   // don't do this at home

    override x.GetHashCode() = hash x

But when I run the above in FSI, the prompt does not return when I either call hash foo on an instance of Test or when I call foo.GetHashCode() directly.

let foo = Test.Foo;;
hash foo;;   // no returning to the console until Ctrl-break
foo.GetHashCode();;  // no return

I couldn't readily proof it, but it suggests that hash x calls GetHashCode() on the object, which means the above code is dangerous. Or is it just FSI playing up?

I thought code like the above just means "please implement custom equality, but leave the hash function as default".

I have meanwhile implemented this pattern differently, but am still wondering whether I am correct in assuming that hash just calls GetHashCode(), leading to an eternal loop.


As an aside, using equality inside FSI returns immediately, suggesting that it either does not call GetHashCode() prior to comparison, or it does something else. Update: this makes sense as in the example above x.Equals does not call GetHashCode(), and the equality operator calls into Equals, not into GetHashCode().


Solution

  • It's not quite as simple as the hash function simply being a wrapper for GetHashCode but I can comfortably tell you that it's definitely not safe to use the implementation : override x.GetHashCode() = hash x.

    If you trace the hash function through, you end up here:

    let rec GenericHashParamObj (iec : System.Collections.IEqualityComparer) (x: obj) : int =
        match x with 
        | null -> 0 
        | (:? System.Array as a) -> 
            match a with 
            | :? (obj[]) as oa -> GenericHashObjArray iec oa 
            | :? (byte[]) as ba -> GenericHashByteArray ba 
            | :? (int[]) as ba -> GenericHashInt32Array ba 
            | :? (int64[]) as ba -> GenericHashInt64Array ba 
            | _ -> GenericHashArbArray iec a 
        | :? IStructuralEquatable as a ->    
            a.GetHashCode(iec)
        | _ -> 
            x.GetHashCode()
    

    You can see here that the wild-card case calls x.GetHashCode(), hence it's very possible to find yourself in an infinite recursion.

    The only case I can see where you might want to use hash inside an implementation of GetHashCode() would be when you are manually hashing some of an object's members to produce a hash code.

    There is a (very old) example of using hash inside GetHashCode() in this way in Don Syme's WebLog.


    By the way, that's not the only thing unsafe about the code you posted.

    Overrides for object.Equals absolutely must not throw exceptions. If the types do not match, they are to return false. This is clearly documented in System.Object.

    Implementations of Equals must not throw exceptions; they should always return a value. For example, if obj is null, the Equals method should return false instead of throwing an ArgumentNullException.

    (Source)