Tags: c#, generics, type-parameter, type-constraints, iequatable

When to specify constraint `T : IEquatable<T>` even though it is not strictly required?


In short, I am looking for guidance on which of the following two methods should be preferred (and why):

static IEnumerable<T> DistinctA<T>(this IEnumerable<T> xs)
{
    return new HashSet<T>(xs);
}

static IEnumerable<T> DistinctB<T>(this IEnumerable<T> xs) where T : IEquatable<T>
{
    return new HashSet<T>(xs);
}
  • Argument in favour of DistinctA: The constraint on T is not required, because HashSet<T> does not require it, and because instances of any T are guaranteed to be convertible to System.Object, which already provides Equals and GetHashCode (IEquatable<T> itself only adds a strongly typed Equals(T)). (While the non-generic methods will cause boxing with value types, that's not what I'm concerned about here.)

  • Argument in favour of DistinctB: The generic parameter constraint, while not strictly necessary, makes visible to callers that the method will compare instances of T, and is therefore a signal that Equals and GetHashCode should work correctly for T. (After all, defining a new type without explicitly implementing Equals and GetHashCode happens very easily, so the constraint might help catch some errors early.)
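
To make the second argument concrete, here is a minimal sketch of the kind of mistake the constraint is meant to surface. The `Point` class, the `UnconstrainedDistinctExtensions` wrapper, and the `PointDemo` helper are hypothetical names introduced only for illustration; `DistinctA` is the method from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: it neither overrides Equals/GetHashCode
// nor implements IEquatable<Point>.
public sealed class Point
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
}

public static class UnconstrainedDistinctExtensions
{
    // Unconstrained version from the question.
    public static IEnumerable<T> DistinctA<T>(this IEnumerable<T> xs)
    {
        return new HashSet<T>(xs);
    }
}

public static class PointDemo
{
    public static int CountDistinctPoints()
    {
        var points = new[] { new Point(1, 2), new Point(1, 2) };

        // This compiles, but HashSet<Point> falls back to the reference
        // equality inherited from System.Object, so both instances are kept.
        return points.DistinctA().Count();
    }
}
```

Here `CountDistinctPoints()` returns 2 rather than 1, and nothing in the signature of `DistinctA` hints that `Point` needed its own equality members.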

Question: Is there an established and documented best practice that recommends when to specify this particular constraint (T : IEquatable<T>), and when not to? And if not, is one of the above arguments flawed in any way? (In that case, I'd prefer well-thought-out arguments, not just personal opinions.)


Solution

  • Start by considering when it might matter which of the two mechanisms is used; I can think of only two:

    1. When the code is being translated to another language (either a subsequent version of C#, a related language like Java, or a completely dissimilar language such as Haskell). In this case the second definition is clearly better, because it gives the translator, whether automated or manual, more information.
    2. When a user unfamiliar with the code is reading it to learn how to invoke the method. Again, I believe the second is clearly better, because the signature itself tells such a user what T must support.
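
As a rough sketch of the second point, the constrained signature tells a caller up front what `T` has to provide. The `Money` struct, the `ConstrainedDistinctExtensions` wrapper, and the `MoneyDemo` helper below are hypothetical and exist only for this example; `DistinctB` is the method from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical value type that opts in to typed equality.
public readonly struct Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // Strongly typed equality required by the constraint.
    public bool Equals(Money other) =>
        Amount == other.Amount && Currency == other.Currency;

    // Keep the System.Object members consistent with Equals(Money).
    public override bool Equals(object obj) => obj is Money other && Equals(other);
    public override int GetHashCode() => (Amount, Currency).GetHashCode();
}

public static class ConstrainedDistinctExtensions
{
    // Constrained version from the question.
    public static IEnumerable<T> DistinctB<T>(this IEnumerable<T> xs) where T : IEquatable<T>
    {
        return new HashSet<T>(xs);
    }
}

public static class MoneyDemo
{
    public static int CountDistinctAmounts()
    {
        var amounts = new[]
        {
            new Money(5m, "EUR"),
            new Money(5m, "EUR"),
            new Money(7m, "EUR"),
        };

        // Money satisfies the constraint, so the call compiles and the
        // two equal values collapse into one.
        return amounts.DistinctB().Count();
    }

    // By contrast, this call does not compile, because object does not
    // implement IEquatable<object>:
    // new[] { new object() }.DistinctB();
}
```

With these definitions `CountDistinctAmounts()` returns 2. Note that the constraint only documents and enforces the interface; it does not check that `Equals` and `GetHashCode` are implemented consistently.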

    I cannot think of any circumstance in which the first definition would be preferred, and where it actually matters beyond personal preference.

    Other thoughts?