Search code examples
f#equality

IEquatable in F#, = operator performance and structural equality


I'm wondering in which cases equality tests in F# cause boxing, and whether there are cases in which overriding Equals and GetHashCode and implementing IEquatable<> is preferable to using the StructuralEqualityAttribute. If so, can it be done without reducing the performance of the = operator?

For simple structs holding one integer, I ran a loop that repeats the same equality check 1M times. I timed the loop using...

  • = with custom (type and value test) equality: about 110ms
  • = with structural equality: 20ms to 25ms
  • A custom == operator that redirects to IEquatable: 1ms to 3ms
  • A custom == operator that compares the values directly: 0ms (erased by optimizer)

From what I understand, the IEquatable<> interface can be used as a performance optimization to prevent boxing when checking for equality. This seems to be common in C#, but I can hardly find mentions of it in F#. Also, the F# compiler complains when trying to override the = operator for a given type.

The StructuralEquality attribute is documented in the MSDN to override only Equals and GetHashCode. Still, it does prevent the explicit implementation of IEquatable<>. However, the resulting type is incompatible with IEquatable<MyType>. This doesn't seem logical to me, should a structurally equated type not implement IEquatable<>?

There is a note on the performance of = in the F# specification (8.15.6.2 in the 3.0 spec), but I don't know what to make of it:

Note: In practice, fast (but semantically equivalent) code is emitted for direct calls to (=), compare, and hash for all base types, and faster paths are used for comparing most arrays

The definition of "base types" given before doesn't seem useful to read this note. Does this refer to basic types?

I'm confused. What is going on? What would a proper equality implementation look like, if the type might be used as a collection key or in a frequent equality test?


Solution

  • Here's what I've gathered based on my limited experience:

    However, the resulting type is incompatible with IEquatable<MyType>.

    This is incorrect, the resulting type does implement IEquatable<MyType>. You can verify in ILDasm. Example:

    [<StructuralEquality;StructuralComparison>]    
    type SomeType = {
        Value : int
    }
    
    let someTypeAsIEquatable = { Value = 3 } :> System.IEquatable<SomeType>
    someTypeAsIEquatable.Equals({Value = 3}) |> ignore // calls Equals(SomeType) directly
    

    Perhaps you are confused by the way F# doesn't do implicit upcasts like C#, so if you were to just do:

    { Value = 3 }.Equals({Value = 4})
    

    This would actually call Equals(obj) instead of the interface member, which is contrary to expectation coming from C#.

    I'm wondering in which cases equality tests in F# cause boxing

    Update April 26 2024: the bug described below was just fixed. Struct equality should no longer be boxing in the next version of F#. Hurray!

    Original text:

    One common and vexing case is for any struct defined in e.g. C# and implementing IEquatable<T>, for example:

    public struct Vector2f : IEquatable<Vector2f>
    

    or similarly, any struct defined in F# with a custom implementation of IEquatable<T>, for example:

    [<Struct;CustomEquality;NoComparison>]
    type MyVal =
        val X : int
        new(x) = { X = x }
        override this.Equals(yobj) =
            match yobj with
            | :? MyVal as y -> y.X = this.X
            | _ -> false
        interface System.IEquatable<MyVal> with
            member this.Equals(other) =
                other.X = this.X
    

    Comparing two instances of this struct with the = operator actually calls Equals(obj) instead of Equals(MyVal), causing boxing to occur on both values being compared (and then casting and unboxing). Note: I reported this as a bug on the Visualfsharp Github.

    And if you think casting to IEquatable<T> explicitely will help, well it will, but it's a boxing operation in itself. But at least you can save yourself one of the two boxings this way.

    I'm confused. What is going on? What would a proper equality implementation look like, if the type might be used as a collection key or in a frequent equality test?

    I'm as confused as you are. F# appears to be very GC-happy. Even the default behavior:

    [<Struct>]
    type MyVal =
        val X : int
        new(x) = { X = x }
    
    for i in 0 .. 1000000 do
          (MyVal(i) = MyVal(i + 1)) |> ignore;;
    Réel : 00:00:00.008, Processeur : 00:00:00.015, GC gén0: 4, gén1: 1, gén2: 0
    

    Still causes boxing and undue GC pressure! See below for a workaround.

    What if the type must be used as a key in e.g. a dictionary? Well if it's System.Collections.Generics.Dictionary you're fine, that doesn't use the F# equality operator. But any collection defined in F# that uses this operator will run into boxing issues apparently.

    I'm wondering (...) whether there are cases in which overriding Equals and GetHashCode and implementing IEquatable<> is preferable to using the StructuralEqualityAttribute.

    The point would be to define your own custom equality, in which case you use the CustomEqualityAttribute instead of the StructuralEqualityAttribute.

    If so, can it be done without reducing the performance of the = operator?

    Update: I suggest avoiding the default (=) and directly using IEquatable(T).Equals. You can define an inline operator for that, or you can even redefine (=) in terms of it. This does the right for almost all types in F#, and for the rest it won't compile so you won't run into subtle bugs. (Update 2022: I'm not sure this is a great idea.)

    Original: Starting with F# 4.0, you can do the following (thanks latkin):

    [<Struct>]
    type MyVal =
        val X : int
        new(x) = { X = x }
        static member op_Equality(this : MyVal, other : MyVal) =
            this.X = other.X
    
    module NonStructural =
        open NonStructuralComparison
        let test () =
            for i in 0 .. 10000000 do
                  (MyVal(i) = MyVal(i + 1)) |> ignore
    
    // Real: 00:00:00.003, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
    NonStructural.test()
    

    The NonStructuralComparison module overrides the default = with a version that simply calls op_Equality. I would add NoEquality and NoComparison attributes to the struct just to make sure you don't accidentally use the poorly performing default =.