c# .net garbage-collection clr

How does the garbage collector prioritize which WeakReference targets to collect?


I am new to C# and am currently learning about the garbage collector. Specifically, I am now learning about the WeakReference class.

The example below is from MSDN's documentation on WeakReference.

using System;
using System.Collections.Generic;

public class Program
{
    public static void Main()
    {
        // Create the cache.
        int cacheSize = 50;
        Random r = new Random();
        Cache c = new Cache(cacheSize);

        string DataName = "";
        GC.Collect(0);

        // Randomly access objects in the cache.
        for (int i = 0; i < c.Count; i++) {
            int index = r.Next(c.Count);

            // Access the object by getting a property value.
            DataName = c[index].Name;
        }
        // Show results.
        double regenPercent = c.RegenerationCount/(double)c.Count;
        Console.WriteLine($"Cache size: {c.Count}, Regenerated: {regenPercent:P0}");
    }
}

public class Cache
{
    // Dictionary to contain the cache.
    static Dictionary<int, WeakReference> _cache;

    // Track the number of times an object is regenerated.
    int regenCount = 0;

    public Cache(int count)
    {
        _cache = new Dictionary<int, WeakReference>();

        // Add objects with a short weak reference to the cache.
        for (int i = 0; i < count; i++) {
            _cache.Add(i, new WeakReference(new Data(i), false));
        }
    }

    // Number of items in the cache.
    public int Count
    {
        get {  return _cache.Count; }
    }

    // Number of times an object needs to be regenerated.
    public int RegenerationCount
    {
        get { return regenCount; }
    }

    // Retrieve a data object from the cache.
    public Data this[int index]
    {
        get {
            Data d = _cache[index].Target as Data;
            if (d == null) {
                // If the object was reclaimed, generate a new one.
                Console.WriteLine("Regenerate object at {0}: Yes", index);
                d = new Data(index);
                _cache[index].Target = d;
                regenCount++;
            }
            else {
                // Object was obtained with the weak reference.
                Console.WriteLine("Regenerate object at {0}: No", index);
            }

            return d;
        }
    }
}

// This class creates byte arrays to simulate data.
public class Data
{
    private byte[] _data;
    private string _name;

    public Data(int size)
    {
        _data = new byte[size * 1024];
        _name = size.ToString();
    }

    // Simple property.
    public string Name
    {
        get { return _name; }
    }
}
// Example of the last lines of the output:
//
// ...
// Regenerate object at 36: Yes
// Regenerate object at 8: Yes
// Regenerate object at 21: Yes
// Regenerate object at 4: Yes
// Regenerate object at 38: No
// Regenerate object at 7: Yes
// Regenerate object at 2: Yes
// Regenerate object at 43: Yes
// Regenerate object at 38: No
// Cache size: 50, Regenerated: 94%

How does the Garbage Collector prioritize which WeakReference objects it is going to collect? Why did the GC choose to remove 94% of the objects from the cache and keep only 6%?


Solution

  • WeakReferences allow the referenced object to be garbage collected, and the garbage collector will collect all collectable objects in the generation it is collecting.

    My guess is that some of the objects had already been promoted to gen 1/2 when the call to GC.Collect(0) occurred. If you replace it with GC.Collect(2), I would expect no objects to be left alive.

    WeakReference is typically not a great idea for caching. I avoid it and use strong references when I want to cache something, usually in combination with some way to estimate memory usage so that I can set an upper memory limit. Memory is often plentiful nowadays, so reducing memory usage is of limited importance.
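
    As a rough illustration of that approach, here is a minimal sketch of a strong-reference cache with a byte budget. The class name BoundedCache, its members, and the least-recently-used eviction policy are all assumptions made for this example, not something the question's code or the framework provides. Memory usage is estimated simply as the length of each byte array.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical strong-reference cache: entries stay alive until the
    // estimated byte budget is exceeded, then the least recently used go first.
    public class BoundedCache
    {
        private readonly long _maxBytes;
        private long _currentBytes;
        private readonly Dictionary<int, LinkedListNode<KeyValuePair<int, byte[]>>> _map
            = new Dictionary<int, LinkedListNode<KeyValuePair<int, byte[]>>>();
        private readonly LinkedList<KeyValuePair<int, byte[]>> _lru
            = new LinkedList<KeyValuePair<int, byte[]>>();

        public BoundedCache(long maxBytes) { _maxBytes = maxBytes; }

        public int Count => _map.Count;
        public long CurrentBytes => _currentBytes;

        public bool TryGet(int key, out byte[] value)
        {
            if (_map.TryGetValue(key, out var node))
            {
                // Move the entry to the front to mark it as recently used.
                _lru.Remove(node);
                _lru.AddFirst(node);
                value = node.Value.Value;
                return true;
            }
            value = null;
            return false;
        }

        public void Add(int key, byte[] value)
        {
            // Replace an existing entry with the same key, if any.
            if (_map.TryGetValue(key, out var existing))
            {
                _currentBytes -= existing.Value.Value.Length;
                _lru.Remove(existing);
                _map.Remove(key);
            }

            var node = new LinkedListNode<KeyValuePair<int, byte[]>>(
                new KeyValuePair<int, byte[]>(key, value));
            _lru.AddFirst(node);
            _map[key] = node;
            _currentBytes += value.Length;

            // Evict from the tail until under budget; the newest entry is always kept.
            while (_currentBytes > _maxBytes && _lru.Count > 1)
            {
                var last = _lru.Last;
                _lru.RemoveLast();
                _map.Remove(last.Value.Key);
                _currentBytes -= last.Value.Value.Length;
            }
        }
    }

    public static class BoundedCacheDemo
    {
        public static void Main()
        {
            var cache = new BoundedCache(100 * 1024);
            for (int i = 0; i < 50; i++)
            {
                cache.Add(i, new byte[i * 1024]);
            }
            Console.WriteLine("Entries kept: " + cache.Count + ", bytes: " + cache.CurrentBytes);
        }
    }
    ```

    Unlike the WeakReference version, what stays in this cache depends only on the budget and the access order, not on when the GC happens to run.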

    There is also the related concept of memory pooling, which can be used to reduce gen 2 GCs when using large memory buffers.
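
    One way to pool buffers is System.Buffers.ArrayPool<T>. This is only a sketch of the idea; choosing ArrayPool here is my assumption, and the helper name SumWithPooledBuffer is made up for the example. Arrays of roughly 85,000 bytes or more are allocated on the large object heap, which is only collected together with gen 2, so reusing such buffers avoids allocating them repeatedly.

    ```csharp
    using System;
    using System.Buffers;

    public static class PoolingDemo
    {
        // Fills a pooled buffer with a repeating 0..255 pattern and returns the sum
        // of the bytes written. The buffer goes back to the pool when done.
        public static long SumWithPooledBuffer(int size)
        {
            // Rent may return an array longer than requested,
            // so only the first `size` bytes are used.
            byte[] buffer = ArrayPool<byte>.Shared.Rent(size);
            try
            {
                long sum = 0;
                for (int i = 0; i < size; i++)
                {
                    buffer[i] = (byte)(i % 256);
                    sum += buffer[i];
                }
                return sum;
            }
            finally
            {
                // Always return the buffer, even if the work above throws.
                ArrayPool<byte>.Shared.Return(buffer);
            }
        }

        public static void Main()
        {
            // 200 KB is above the large object heap threshold.
            Console.WriteLine(SumWithPooledBuffer(200 * 1024));
        }
    }
    ```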

    Edit:

    Simplified your example a bit:

        using System;
        using System.Collections.Generic;
        using System.Linq;

        public static class Program
        {
            static void Main(string[] args)
            {
                for (int i = 0; i < 100; i++)
                {
                    // Suppress GC while the cache is built so that no target
                    // is collected or promoted before we count.
                    bool noGc = GC.TryStartNoGCRegion(10000000);
                    var cache = new Cache(50);
                    Console.Write("Objects alive Before: " + cache.CountObjectsAlive());
                    if (noGc) GC.EndNoGCRegion();
                    GC.Collect(2, GCCollectionMode.Forced);
                    Console.WriteLine("\tAfter : " + cache.CountObjectsAlive());
                }
                Console.ReadKey();
            }
        }

        public class Cache
        {
            // Dictionary to contain the cache.
            Dictionary<int, WeakReference> _cache = new Dictionary<int, WeakReference>();

            public Cache(int count)
            {
                // Add objects with a short weak reference to the cache.
                for (int i = 0; i < count; i++)
                {
                    _cache.Add(i, new WeakReference(new byte[i * 1024], false));
                }
            }

            // Count the entries whose target has not been reclaimed yet.
            public int CountObjectsAlive() => _cache.Values.Count(v => v.Target != null);
        }
    

    This consistently gives 50 objects alive before the GC and 0 objects alive after. Note the use of TryStartNoGCRegion to prevent a GC from running while the objects are created. If I change the program so that some objects are promoted to gen 1/2 and then collect only gen 0, some objects survive.
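
    A minimal sketch of that variation, assuming a single target is enough to show the effect: the object is kept strongly rooted through one full collection so that it gets promoted, then the root is dropped and only gen 0 is collected. The names GenerationDemo, _root, and CreatePromotedTarget are made up for this example. I would expect the weak reference to survive GC.Collect(0) and to be cleared by GC.Collect(2), but the exact behavior depends on the runtime and GC configuration, so treat the printed values as something to observe rather than a guarantee.

    ```csharp
    using System;
    using System.Runtime.CompilerServices;

    public static class GenerationDemo
    {
        // Strong root that keeps the target alive long enough to be promoted.
        private static object _root;

        // NoInlining keeps the temporary locals out of Main's stack frame.
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static WeakReference CreatePromotedTarget()
        {
            _root = new byte[1024];
            var weak = new WeakReference(_root, false);
            GC.Collect(); // the target survives because _root still references it
            return weak;
        }

        public static void Main()
        {
            WeakReference weak = CreatePromotedTarget();
            Console.WriteLine("Generation while rooted: " + GC.GetGeneration(_root));

            _root = null;   // from here on only the weak reference remains
            GC.Collect(0);  // collects gen 0 only
            Console.WriteLine("Alive after GC.Collect(0): " + weak.IsAlive);

            GC.Collect(2, GCCollectionMode.Forced);
            Console.WriteLine("Alive after GC.Collect(2): " + weak.IsAlive);
        }
    }
    ```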

    So I would say my point still stands: the GC will collect all collectable objects in the generation it collects. And you are probably better off not messing with WeakReferences unless you have a specific use case.