I came across a very funny situation where comparing a nullable type to null inside a generic method is 234x slower than comparing a value type or a reference type. The code is as follows:
static bool IsNull<T>(T instance)
{
    return instance == null;
}
The execution code is:
int? a = 0;
string b = "A";
int c = 0;

var watch = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
    var r1 = IsNull(a);
}
Console.WriteLine(watch.Elapsed.ToString());

watch.Restart();
for (int i = 0; i < 1000000; i++)
{
    var r2 = IsNull(b);
}
Console.WriteLine(watch.Elapsed.ToString());

watch.Restart();
for (int i = 0; i < 1000000; i++)
{
    var r3 = IsNull(c);
}
watch.Stop();
Console.WriteLine(watch.Elapsed.ToString());

Console.ReadKey();
The output for the code above is:
00:00:00.1879827
00:00:00.0008779
00:00:00.0008532
As you can see, comparing a nullable int to null is 234x slower than comparing an int or a string. If I add a second overload with the right constraint, the results change dramatically:
static bool IsNull<T>(T? instance) where T : struct
{
    return instance == null;
}
Now the results are:
00:00:00.0006040
00:00:00.0006017
00:00:00.0006014
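To rule out first-call JIT compilation cost in the timed loops, each instantiation can also be warmed up once before the stopwatch starts (a small addition to the code above):

IsNull(a); // JIT-compiles IsNull<int?>
IsNull(b); // JIT-compiles IsNull<string>
IsNull(c); // JIT-compiles IsNull<int>
var watch = Stopwatch.StartNew();
// ... timed loops as above ...

A one-time JIT cost for a method this small wouldn't account for ~190 ms anyway, so whatever is slow must be happening on every iteration.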
Why is that? I didn't check the generated IL because I'm not fluent in it, but even if the IL were a little different, I would expect the JIT to optimize this, and it doesn't (I'm running with optimizations enabled).
If you compare the IL produced by the two overloads, you can see that boxing is involved in the unconstrained one.
The first looks like:
.method private hidebysig static bool IsNull<T>(!!T instance) cil managed
{
    .maxstack 2
    .locals init (
        [0] bool CS$1$0000)
    L_0000: nop
    L_0001: ldarg.0
    L_0002: box !!T
    L_0007: ldnull
    L_0008: ceq
    L_000a: stloc.0
    L_000b: br.s L_000d
    L_000d: ldloc.0
    L_000e: ret
}
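In C# terms, that box / ldnull / ceq sequence is roughly the following (the method name is mine, just for illustration):

static bool IsNullUnconstrained<T>(T instance)
{
    // An unconstrained T has to be boxed before it can be compared to null.
    // For T = int?, boxing yields null when HasValue is false, and a freshly
    // allocated boxed int when it is true, i.e. an allocation on every call.
    object boxed = (object)instance;
    return boxed == null;
}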
While the second looks like:
.method private hidebysig static bool IsNull<valuetype ([mscorlib]System.ValueType) .ctor T>(valuetype [mscorlib]System.Nullable`1<!!T> instance) cil managed
{
    .maxstack 2
    .locals init (
        [0] bool CS$1$0000)
    L_0000: nop
    L_0001: ldarga.s instance
    L_0003: call instance bool [mscorlib]System.Nullable`1<!!T>::get_HasValue()
    L_0008: ldc.i4.0
    L_0009: ceq
    L_000b: stloc.0
    L_000c: br.s L_000e
    L_000e: ldloc.0
    L_000f: ret
}
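In C# terms, the second overload is just a HasValue test (again, the method name is mine):

static bool IsNullConstrained<T>(T? instance) where T : struct
{
    // No boxing: the compiler calls Nullable<T>.get_HasValue directly.
    return !instance.HasValue;
}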
In the second case, the compiler knows the type is a Nullable<T>, so it can test for null by calling HasValue directly. In the first case, it has to handle any type, both reference and value types, so it goes through the box-then-compare sequence above.
As for why int is faster than int?: I'd imagine the JIT is at work there. When it specializes the method for a non-nullable value type like int, it knows the boxed value can never be null, so it can skip the boxing entirely and fold the comparison to a constant false. It can't do that for int?, because the result of boxing a Nullable<int> depends on HasValue: null when it is false, a freshly boxed int when it is true. So the box, and its allocation, happen on every call.
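If you want null checks for both reference types and nullable value types without the boxing, one option is to keep two constrained overloads side by side; the struct one is yours from the question, the class one is my addition:

static bool IsNull<T>(T? instance) where T : struct
{
    return instance == null; // compiled as !instance.HasValue, no boxing
}

static bool IsNull<T>(T instance) where T : class
{
    return instance == null; // plain reference comparison, no boxing
}

Overload resolution picks the right one at each call site; a plain int argument still works, binding to the struct overload through the implicit int-to-int? conversion.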