While creating a unit test for my C# code that works with WeakReferences, I ran into some weird GC behavior - weird because I have not been able to come up with an explanation for it.
The issue stems from using the ?. null-conditional operator on an object obtained from my weak reference after the GC is meant to have collected it.
Here's minimal code that replicates it:
public class XYZClass
{
    public string Name { get; set; }
}
public class Tests
{
    public void NormalBehavior()
    {
        var @ref = new WeakReference<XYZClass>(new XYZClass { Name = "bleh" });
        GC.Collect();
        GC.WaitForPendingFinalizers();
        XYZClass t;
        @ref.TryGetTarget(out t);
        Console.WriteLine(t == null); //outputs true
    }
    public void WeirdBehavior()
    {
        var @ref = new WeakReference<XYZClass>(new XYZClass { Name = "bleh" });
        GC.Collect();
        GC.WaitForPendingFinalizers();
        XYZClass t;
        @ref.TryGetTarget(out t);
        Console.WriteLine(t == null); //outputs false
        Console.WriteLine(t?.Name == null); //outputs false
    }
}
The behavior wasn't exhibited when this code was run in LINQPad. I also checked the compiled IL (using LINQPad) and still couldn't find anything amiss.
This has nothing to do with the null conditional operator. You can easily see this by replacing it with normal member access:
Console.WriteLine(t == null); //outputs false
Console.WriteLine(t.Name == null); //outputs false
The original reference to the new XYZClass object never goes "out of scope" in the debug build (when running under the debugger). Turn optimizations off in LINQPad, and you'll also see that t is not null. But note that all of this is an implementation detail - depending on the particulars of your system, you can get either result (for example, I get what you get on 32-bit Debug builds, but not 64-bit Debug builds).
The only guarantee you get as to managed object lifetime in .NET is that a strong reference outside of a finalizer will prevent an object from being collected. Forget all deterministic memory management - it just isn't there. A .NET implementation that has no garbage collector at all would be perfectly valid.
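To make that one guarantee concrete, here is a minimal sketch (the KeepAliveDemo class and its variable names are made up for illustration, not taken from the question). It relies on the documented behavior of GC.KeepAlive, which keeps its argument ineligible for collection from the start of the method up to the point of the call:

```csharp
using System;

public static class KeepAliveDemo
{
    public static void Main()
    {
        // 'strong' is an ordinary strong reference held by a local.
        var strong = new object();
        var weak = new WeakReference<object>(strong);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        object target;
        // The object must still be reachable here, because of the
        // GC.KeepAlive call further down.
        Console.WriteLine(weak.TryGetTarget(out target)); // True

        // Without this call the JIT would be free to treat 'strong'
        // as dead right after the WeakReference was constructed.
        GC.KeepAlive(strong);
    }
}
```

Remove the GC.KeepAlive call and you are back in implementation-detail territory: the result may differ between Debug and Release, or between 32-bit and 64-bit.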
So let's have a look at the code being generated on my machine in particular. In the 64-bit build, t.Name == null and t?.Name == null have exactly the same results (though of course t.Name == null will cause a NullReferenceException instead of returning true). What about the 32-bit build?
The t.Name == null part is substantially shorter:
00533111 mov ecx,dword ptr [ebp-44h] ; t
00533114 cmp dword ptr [ecx],ecx ; null check
00533116 call 00530D28 ; t.get_Name
0053311B mov dword ptr [ebp-54h],eax ; Name string
0053311E cmp dword ptr [ebp-54h],0 ; is null?
00533122 sete cl
00533125 movzx ecx,cl
00533128 call 708B09F4
You can see that we use two registers (ecx and eax), and two stack slots (-44h and -54h). What about the t?.Name == null one?
001F3111 cmp dword ptr [ebp-44h],0 ; is t null?
001F3115 jne 001F311F
001F3117 nop
001F3118 xor edx,edx
001F311A mov dword ptr [ebp-54h],edx ; t?.Name evaluates to null
001F311D jmp 001F312A
001F311F mov ecx,dword ptr [ebp-44h] ; t
001F3122 call 001F0D28 ; t.get_Name
001F3127 mov dword ptr [ebp-54h],eax
001F312A cmp dword ptr [ebp-54h],0 ; is name null?
001F312E sete cl
001F3131 movzx ecx,cl
001F3134 call 708B09F4
001F3139 nop
We're still using the same two stack slots, but another register is required - edx. Could this be what we're looking for? You betcha! If we look at how the object is originally created:
001F30A0 mov ecx,2C0814h
001F30A5 call 001330F4 ; new XYZClass
001F30AA mov dword ptr [ebp-48h],eax ; tmp
001F30AD mov ecx,dword ptr [ebp-48h]
001F30B0 call 001F0D38 ; tmp.XYZClass()
001F30B5 mov edx,dword ptr ds:[36B230Ch] ; "bleh"
001F30BB mov ecx,dword ptr [ebp-48h]
001F30BE cmp dword ptr [ecx],ecx
001F30C0 call 001F0D30 ; tmp.set_Name("bleh")
001F30C5 nop
001F30C6 mov ecx,2C0858h
001F30CB call 710F9ECF ; new WeakReference
001F30D0 mov dword ptr [ebp-4Ch],eax
001F30D3 mov ecx,dword ptr [ebp-4Ch]
001F30D6 mov edx,dword ptr [ebp-48h] ; EDX references tmp!
001F30D9 call 709090B0
001F30DE mov eax,dword ptr [ebp-4Ch]
001F30E1 mov dword ptr [ebp-40h],eax
You can see that it so happens that the null-conditional version uses the same register that was used to hold the temporary reference to XYZClass. And that's where the difference stems from - the runtime cannot rule out that the edx access is a use of the temporary reference, so it plays it safe and keeps the object rooted, which prevents it from being collected.
The 64-bit version (and running without debugger attached) doesn't see the difference, because it reuses a different register - on my particular machine, the 64-bit version reuses rcx (which holds a reference to the WeakReference, not XYZClass), and the non-debugger 32-bit version reuses eax (which holds a reference to "bleh"). Since edx (and rdx) are never used in the method, the temporary reference isn't rooted anymore, and is free to be collected.
Why does the debugger version use edx in particular? Most likely, it's trying to be helpful. In the middle of the null-conditional operator, you want to see the value of both t and t?.Name, so they had better be accessible (you can see this in Locals as "XYZClass.Name.get returned "bleh" string").
Again, note that this is entirely implementation specific. The contract only specifies when an object must not be reclaimed - it doesn't say when it will be reclaimed.
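If you still want a unit test along these lines, a common workaround is to create the object in a separate, non-inlined method, so that the temporary strong reference lives in a stack frame that is already gone when the collection runs. The sketch below assumes a hypothetical CreateWeak helper and WeakReferenceDemo class (neither is from the question). It makes collection far more likely in practice, but per the contract above it still guarantees nothing, which is why the output is reported rather than asserted:

```csharp
using System;
using System.Runtime.CompilerServices;

public class XYZClass
{
    public string Name { get; set; }
}

public static class WeakReferenceDemo
{
    // Hypothetical helper: NoInlining keeps the temporary reference to the
    // new XYZClass inside this method's own frame and registers, instead of
    // the caller's.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference<XYZClass> CreateWeak()
    {
        return new WeakReference<XYZClass>(new XYZClass { Name = "bleh" });
    }

    public static void Main()
    {
        var weak = CreateWeak();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        XYZClass t;
        bool alive = weak.TryGetTarget(out t);

        // Typically "collected", but the runtime is allowed to keep the
        // object around, so treat this as an observation, not a guarantee.
        Console.WriteLine(alive ? "still alive" : "collected");
    }
}
```

A test built this way should treat a surviving object as inconclusive rather than as a failure, since a stale register or stack slot can still keep it rooted.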