c# class-design boxing

Should I use struct or class?


I am in a classic design dilemma. I am writing a C# data structure for containing a value and measurement unit tuple (e.g. 7.0 millimeters) and I am wondering if I should use a reference type or a value type.

The benefits of a struct should be less heap action giving me better performance in expressions and less stress on the garbage collector. This would normally be my choice for a simple type like this, but there are drawbacks in this concrete case.

The tuple is part of a rather general analysis result framework where the results are presented in different ways in a WPF application depending on the type of the result value. This kind of weak typing is handled exceptionally well by WPF with all its data templates, value converters and template selectors. The implication is that the value will undergo a lot of boxing / unboxing if my tuple is represented as a struct. In fact, the tuple will be used far less in expressions than in boxing scenarios. To avoid all the boxing I am considering declaring my type as a class. Another worry about a struct is that there could be pitfalls with two-way binding in WPF, since it would be easier to end up with copies of the tuple somewhere in the code rather than copies of a reference to it.
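To make the boxing and copy concerns concrete, here is a minimal sketch. The type `UnitValueStruct` and its members are hypothetical stand-ins for the tuple described above, not the real type.

```csharp
using System;

// Hypothetical stand-in for the value/unit tuple; names are assumptions.
public struct UnitValueStruct
{
    public double Value;
    public string Unit;
}

public static class BoxingDemo
{
    public static void Main()
    {
        var original = new UnitValueStruct { Value = 7.0, Unit = "mm" };

        // Assigning a struct to an object-typed variable boxes it:
        // a copy of the value is allocated on the heap.
        object boxed = original;

        // Mutating the original does not affect the boxed copy.
        original.Value = 8.0;

        // Unboxing copies the value back out of the box.
        var unboxed = (UnitValueStruct)boxed;

        Console.WriteLine(original.Value); // 8
        Console.WriteLine(unboxed.Value);  // 7
    }
}
```

Anything that hands the value around as `object`, which is what the WPF scenario above amounts to, would work on such an independent copy.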

I also have some convenient operator overloading. I am able to compare, say, millimeters with centimeters without problems using overloaded comparison operators. However, I don't like the idea of overloading == and != if my tuple is a class, since the convention is that == and != mean ReferenceEquals for reference types (System.String being the well-known exception, which is another classic discussion). If == and != are overloaded, someone will write if (myValue == null) and get a nasty runtime exception when myValue one day turns out to be null.
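Whether `myValue == null` throws depends on how the operator is written. The sketch below shows one common null-safe pattern. The class `UnitValue`, its factory methods, and the choice to normalize everything to millimeters are assumptions made for illustration only.

```csharp
using System;

// Hypothetical class version of the tuple; all names are assumptions.
public sealed class UnitValue
{
    public double Millimeters { get; }

    private UnitValue(double millimeters) { Millimeters = millimeters; }

    public static UnitValue FromMillimeters(double mm) => new UnitValue(mm);
    public static UnitValue FromCentimeters(double cm) => new UnitValue(cm * 10.0);

    public static bool operator ==(UnitValue left, UnitValue right)
    {
        // ReferenceEquals does not call this operator, so there is no recursion,
        // and it also covers the case where both sides are null.
        if (ReferenceEquals(left, right)) return true;
        if (left is null || right is null) return false;
        return left.Millimeters == right.Millimeters;
    }

    public static bool operator !=(UnitValue left, UnitValue right) => !(left == right);

    // Keep Equals and GetHashCode consistent with the operators.
    public override bool Equals(object obj) => obj is UnitValue other && this == other;
    public override int GetHashCode() => Millimeters.GetHashCode();
}

public static class EqualityDemo
{
    public static void Main()
    {
        UnitValue missing = null;
        UnitValue seventyMm = UnitValue.FromMillimeters(70.0);
        UnitValue sevenCm = UnitValue.FromCentimeters(7.0);

        Console.WriteLine(missing == null);     // True
        Console.WriteLine(seventyMm == null);   // False
        Console.WriteLine(seventyMm == sevenCm); // True
    }
}
```

The exception only appears if the operator dereferences an operand without checking it for null first. Exact floating-point comparison is used here for brevity; a real unit type might want a tolerance.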

Yet another aspect is that there is no clear way in C# (unlike in e.g. C++) to distinguish reference and value types at the point of use, yet the semantics are very different. I worry that users of my tuple (if declared as a struct) will assume that the type is a class, since most custom data structures are, and will therefore assume reference semantics. That is another argument for preferring classes: it's what the user expects, and there is no "." / "->" to tell the two apart. In general I would almost always use a class unless my profiler tells me to use a struct, simply because class semantics are what fellow programmers most likely expect, and C# gives only vague hints whether a type is one thing or the other.

So my questions are:

What other considerations should I weigh when deciding whether to go with a value type or a reference type?

Can == / != overloading in a class be justified in any circumstances?

Programmers assume stuff. Most would probably assume that something called a "Point" is a value type. What would you assume if you read some code with a "UnitValue"?

What would you choose given my usage description?


Solution

  • The benefits of a struct should be less heap action giving me better performance in expressions and less stress on the garbage collector

    Given without any context, this is a vast, and dangerous, overgeneralization. A struct is not automatically eligible for the stack. A struct can be placed on the stack if (and only if) its lifetime and exposure do not extend outside of the function that's declaring it, it doesn't get boxed within that function, and probably a host of other criteria that don't come to mind immediately. A struct that is a field of a class instance or an element of an array lives on the heap with its container, and making a struct local part of a lambda expression or delegate means that it's going to be stored on the heap anyway. The point is not to worry about it, because there's a 99.9% chance that your bottlenecks are somewhere else.
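    The lambda case can be observed indirectly. In the sketch below (the `Counter` struct and `CaptureDemo` class are made up for illustration), the captured local is shared between the method and the lambda, which is only possible because the compiler hoists it out of the stack frame into a closure object:

    ```csharp
    using System;

    // Hypothetical struct used only for this illustration.
    public struct Counter
    {
        public int Count;
    }

    public static class CaptureDemo
    {
        public static int Run()
        {
            var counter = new Counter();

            // The lambda captures 'counter'. Because the lambda is converted to a
            // delegate, the compiler moves the local into a heap-allocated closure
            // object that this method and the lambda both refer to.
            Action increment = () => counter.Count++;

            increment();
            increment();

            // The method sees the lambda's changes: there is one shared variable.
            return counter.Count;
        }

        public static void Main()
        {
            Console.WriteLine(Run()); // 2
        }
    }
    ```

    So declaring the type as a struct does not by itself keep it off the heap; how the value is used matters just as much.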

    As for operator overloading, there's nothing stopping you (either technically or philosophically) from overloading operators on your type. While you're technically correct in that equality comparisons between reference types are, by default, semantically equivalent to object.ReferenceEquals, this is not a be-all and end-all rule. There are two basic things to keep in mind about operator overloading:

    1.) (And this may be the most important from a practical perspective) Operators are not polymorphic. That is, the operator that gets used is chosen at compile time from the declared types of the operands, not from the actual runtime types of the instances.

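    For example, if I declare a type Foo that overloads the == operator so that it always returns true, and then I do this:

    Foo foo1 = new Foo();
    Foo foo2 = new Foo();
    object obj2 = foo2;

    bool compare1 = foo1 == foo2; // true: Foo's overloaded operator
    bool compare2 = foo1 == obj2; // false: reference comparison of two distinct instances


    Even though obj2 holds, in reality, an instance of Foo, the overloaded operator doesn't exist at the level of the type hierarchy through which I'm referencing that instance (object), so the comparison falls back to reference equality. Note that the second operand has to refer to a different instance for the difference to show: comparing foo1 with an object reference to foo1 itself would still yield true, because the two references are identical.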
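    A complete, compilable version of this idea might look like the sketch below. The always-true `Foo` operator follows the description above; the `Equals` override and the `OperatorDemo` class are additions for illustration. It also shows that `Equals`, being virtual, is dispatched on the runtime type, unlike the operator:

    ```csharp
    using System;

    public sealed class Foo
    {
        // Deliberately silly: every Foo compares equal to every other Foo.
        public static bool operator ==(Foo left, Foo right) => true;
        public static bool operator !=(Foo left, Foo right) => false;

        // Equals is virtual, so it is resolved on the runtime type.
        public override bool Equals(object obj) => obj is Foo;
        public override int GetHashCode() => 0;
    }

    public static class OperatorDemo
    {
        public static void Main()
        {
            Foo foo1 = new Foo();
            Foo foo2 = new Foo();
            object obj1 = foo1;
            object obj2 = foo2;

            // Both operands are declared as Foo: the overload is selected.
            Console.WriteLine(foo1 == foo2);      // True

            // Both operands are declared as object: reference comparison,
            // and these are two distinct instances.
            Console.WriteLine(obj1 == obj2);      // False

            // Virtual dispatch reaches Foo.Equals regardless of the declared type.
            Console.WriteLine(obj1.Equals(obj2)); // True
        }
    }
    ```

    This is why code that may hold the value as `object`, as data binding layers typically do, should rely on `Equals` rather than on `==`.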

    2.) Comparison operations should be deterministic. It should not be possible to compare the same two instances using the overloaded operator and get differing results. Practically, this sort of requirement usually results in the types being immutable (since being able to tell the difference between one or more values in a class yet getting true from an equals operator is rather counterintuitive), but fundamentally it just means that you should not be able to alter a state value within an instance in a way that alters the result of a comparison operation. If it makes sense in your scenario to be able to mutate some of the instance state without having it affect the result of a comparison, then there's no reason you shouldn't allow it. That's just a rare case.
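    As a small illustration of the problem, the sketch below uses a hypothetical `MutableLength` class whose equality depends on a settable property. The same two instances compare equal at one moment and unequal the next:

    ```csharp
    using System;

    // Hypothetical mutable type; equality depends on state that can change.
    public sealed class MutableLength
    {
        public double Millimeters { get; set; }

        public static bool operator ==(MutableLength left, MutableLength right)
        {
            if (ReferenceEquals(left, right)) return true;
            if (left is null || right is null) return false;
            return left.Millimeters == right.Millimeters;
        }

        public static bool operator !=(MutableLength left, MutableLength right) => !(left == right);

        public override bool Equals(object obj) => obj is MutableLength other && this == other;
        public override int GetHashCode() => Millimeters.GetHashCode();
    }

    public static class DeterminismDemo
    {
        public static void Main()
        {
            var a = new MutableLength { Millimeters = 7.0 };
            var b = new MutableLength { Millimeters = 7.0 };

            Console.WriteLine(a == b); // True

            // Mutating state that takes part in the comparison changes the answer
            // for the very same pair of instances.
            b.Millimeters = 8.0;

            Console.WriteLine(a == b); // False
        }
    }
    ```

    Making `Millimeters` get-only (set once in a constructor) would remove this possibility, which is the usual reason types with overloaded equality end up immutable.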