Search code examples
c#.netmultithreadingdouble-checked-locking

How to show that the double-checked-lock pattern with Dictionary's TryGetValue is not threadsafe


Recently I've seen some C# projects that use a double-checked-lock pattern on a Dictionary. Something like this:

private static readonly object _lock = new object();
private static volatile IDictionary<string, object> _cache = 
    new Dictionary<string, object>();

public static object Create(string key)
{
    object val;
    if (!_cache.TryGetValue(key, out val))
    {
        lock (_lock)
        {
            if (!_cache.TryGetValue(key, out val))
            {
                val = new object(); // factory construction based on key here.
                _cache.Add(key, val);
            }
        }
    }
    return val;
}

This code is incorrect, since the Dictionary can be "growing" the collection in _cache.Add() while _cache.TryGetValue (outside the lock) is iterating over the collection. It might be extremely unlikely in many situations, but is still wrong.

Is there a simple program to demonstrate that this code fails?

Does it make sense to incorporate this into a unit test? And if so, how?


Solution

  • In this example, exception #1 is thrown almost instantly on my machine:

    var dict = new Dictionary<int, string>() { { 1234, "OK" } };
    
    new Thread(() =>
    {
        for (; ; )
        {
            string s;
            if (!dict.TryGetValue(1234, out s))
            {
                throw new Exception();  // #1
            }
            else if (s != "OK")
            {
                throw new Exception();  // #2
            }
        }
    }).Start();
    
    Thread.Sleep(1000);
    Random r = new Random();
    for (; ; )
    {
        int k;
        do { k = r.Next(); } while (k == 1234);
        Debug.Assert(k != 1234);
        dict[k] = "FAIL";
    }
    

    However, the exact behaviour of code that is not designed to be thread-safe is unpredictable.
    You cannot rely on it. So the double-checking code is indeed broken.

    I'm not sure if I'd unit test this, though, as testing concurrent code (and getting it right) is much more complicated than writing the concurrent code in the first place.