c# garbage-collection null-conditional-operator

Weird C# GC behavior for WeakReferences and the null-conditional operator


While creating a unit test for my C# code that works with WeakReferences, I ran into some weird GC behavior - weird because I have not been able to come up with an explanation for it.

The issue stems from using the ?. null-conditional operator on an object obtained from my weak reference after the GC is meant to have collected it.

Here's minimal code that replicates it:

    public class XYZClass
    {
        public string Name { get; set; }
    }

    public class Tests
    {
        public void NormalBehavior()
        {
            var @ref = new WeakReference<XYZClass>(new XYZClass { Name = "bleh" });

            GC.Collect();
            GC.WaitForPendingFinalizers();

            XYZClass t;
            @ref.TryGetTarget(out t);

            Console.WriteLine(t == null); //outputs true
        }

        public void WeirdBehavior()
        {
            var @ref = new WeakReference<XYZClass>(new XYZClass { Name = "bleh" });

            GC.Collect();
            GC.WaitForPendingFinalizers();

            XYZClass t;
            @ref.TryGetTarget(out t);

            Console.WriteLine(t == null); //outputs false
            Console.WriteLine(t?.Name == null); //outputs false
        }
    }

The behavior wasn't exhibited when this code was run in LINQPad. I also checked the compiled IL (using LINQPad) and still couldn't spot anything amiss.


Solution

  • This has nothing to do with the null-conditional operator. You can easily see this by replacing it with normal member access:

    Console.WriteLine(t == null); //outputs false
    Console.WriteLine(t.Name == null); //outputs false
    

    The original reference to the new XYZClass object never goes "out of scope" in the Debug build when running under the debugger. Turn optimizations off in LINQPad, and you'll also see that t is not null. But note that all of this is an implementation detail - depending on the particulars of your system, you can get either result (for example, I get the same result as you on 32-bit Debug builds, but not on 64-bit Debug builds).

    The only guarantee you get as to managed object lifetime in .NET is that a strong reference outside of a finalizer will prevent an object from being collected. Forget all deterministic memory management - it just isn't there. A .NET implementation that has no garbage collector at all would be perfectly valid.
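
    To make that guarantee concrete, here is a minimal sketch of the one case that is deterministic: a strong reference that is still in use after the collection. The names StrongReferenceDemo and SurvivesWhileStronglyReferenced are made up for illustration; GC.KeepAlive is the standard way to tell the JIT that a local is still live at a given point.

```csharp
using System;

public class XYZClass
{
    public string Name { get; set; }
}

// Hypothetical demo class, not part of the original question.
public static class StrongReferenceDemo
{
    // Returns true when the weak reference still resolves to the same object
    // after a forced collection.
    public static bool SurvivesWhileStronglyReferenced()
    {
        var strong = new XYZClass { Name = "bleh" };
        var weak = new WeakReference<XYZClass>(strong);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        XYZClass target;
        bool alive = weak.TryGetTarget(out target);

        // 'strong' is used after the collection, so the object is reachable
        // throughout and must not be reclaimed.
        GC.KeepAlive(strong);

        return alive && ReferenceEquals(strong, target);
    }

    public static void Main()
    {
        Console.WriteLine(SurvivesWhileStronglyReferenced()); // prints True
    }
}
```

    The opposite direction (the object being collected once the strong reference is gone) is exactly the part the runtime does not promise.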

    So let's have a look at the code being generated on my machine in particular. In the 64-bit build, t.Name == null and t?.Name == null have exactly the same results (though of course t.Name == null will cause a NullReferenceException instead of returning true). What about the 32-bit build?

    The t.Name == null part is substantially shorter:

    00533111  mov         ecx,dword ptr [ebp-44h]   ; t
    00533114  cmp         dword ptr [ecx],ecx  ; null check
    00533116  call        00530D28  ; t.get_Name
    0053311B  mov         dword ptr [ebp-54h],eax  ; Name string
    0053311E  cmp         dword ptr [ebp-54h],0  ; is null?
    00533122  sete        cl  
    00533125  movzx       ecx,cl  
    00533128  call        708B09F4  
    

    You can see that we use two registers (ecx and eax), and two stack slots (-44h and -54h). What about the t?.Name == null one?

    001F3111  cmp         dword ptr [ebp-44h],0   ; is t null?
    001F3115  jne         001F311F  
    001F3117  nop  
    001F3118  xor         edx,edx  
    001F311A  mov         dword ptr [ebp-54h],edx  ; t is null, so t?.Name yields null
    001F311D  jmp         001F312A  
    001F311F  mov         ecx,dword ptr [ebp-44h]  ; t
    001F3122  call        001F0D28                 ; t.get_Name
    001F3127  mov         dword ptr [ebp-54h],eax  
    001F312A  cmp         dword ptr [ebp-54h],0    ; is name null?
    001F312E  sete        cl  
    001F3131  movzx       ecx,cl  
    001F3134  call        708B09F4  
    001F3139  nop  
    

    We're still using the same two stack slots, but another register is required - edx. Could this be what we're looking for? You betcha! If we look at how the object is originally created:

    001F30A0  mov         ecx,2C0814h  
    001F30A5  call        001330F4  ; new XYZClass
    001F30AA  mov         dword ptr [ebp-48h],eax  ; tmp
    001F30AD  mov         ecx,dword ptr [ebp-48h]  
    001F30B0  call        001F0D38  ; tmp.XYZClass()
    001F30B5  mov         edx,dword ptr ds:[36B230Ch]  ; "bleh"
    001F30BB  mov         ecx,dword ptr [ebp-48h]  
    001F30BE  cmp         dword ptr [ecx],ecx  
    001F30C0  call        001F0D30  ; tmp.set_Name("bleh")
    001F30C5  nop  
    001F30C6  mov         ecx,2C0858h  
    001F30CB  call        710F9ECF  ; new WeakReference
    001F30D0  mov         dword ptr [ebp-4Ch],eax  
    001F30D3  mov         ecx,dword ptr [ebp-4Ch]  
    001F30D6  mov         edx,dword ptr [ebp-48h]  ; EDX references tmp!
    001F30D9  call        709090B0  
    001F30DE  mov         eax,dword ptr [ebp-4Ch]  
    001F30E1  mov         dword ptr [ebp-40h],eax  
    

    You can see that it so happens that the null-conditional version uses the same register that was used to hold the temporary reference to XYZClass. And that's where the difference stems from - the runtime cannot rule out that the edx access is a use of the temporary reference, so it plays it safe and keeps the object rooted, which prevents it from being collected.

    The 64-bit version (and running without debugger attached) doesn't see the difference, because it reuses a different register - on my particular machine, the 64-bit version reuses rcx (which holds a reference to the WeakReference, not XYZClass), and the non-debugger 32-bit version reuses eax (which holds a reference to "bleh"). Since edx (and rdx) are never used in the method, the temporary reference isn't rooted anymore, and is free to be collected.

    Why does the debugger version use edx in particular? Most likely, it's trying to be helpful. In the middle of the null-conditional operator, you want to see the value of both t and t?.Name, so they had better be accessible (you can see this in Locals as 'XYZClass.Name.get returned "bleh" string').

    Again, note that this is entirely implementation specific. The contract only specifies when an object must not be reclaimed - it doesn't say when it will be reclaimed.
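
    Given that a leftover temporary in the test method's own frame is what keeps the object alive here, one common workaround in unit tests is to allocate the object in a separate, non-inlined method so that any such temporaries die with that method's frame. The sketch below assumes this pattern; WeakReferenceTestHelper, CreateWeakReference and IsCollectedAfterGC are hypothetical names, and the result is still an implementation detail rather than something the runtime guarantees.

```csharp
using System;
using System.Runtime.CompilerServices;

public class XYZClass
{
    public string Name { get; set; }
}

// Hypothetical helper, not part of the original question.
public static class WeakReferenceTestHelper
{
    // NoInlining keeps the allocation, and any JIT temporaries that point at
    // the new object, inside this method's own stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference<XYZClass> CreateWeakReference()
    {
        return new WeakReference<XYZClass>(new XYZClass { Name = "bleh" });
    }

    // Reports whether the target was gone after a forced collection.
    public static bool IsCollectedAfterGC()
    {
        WeakReference<XYZClass> weak = CreateWeakReference();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        XYZClass target;
        return !weak.TryGetTarget(out target);
    }

    public static void Main()
    {
        // The printed value depends on the runtime, build and debugger;
        // no particular result is guaranteed.
        Console.WriteLine(IsCollectedAfterGC());
    }
}
```

    A test written this way no longer depends on which registers or stack slots the JIT reuses in the calling method, but it should still be treated as a best-effort check.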