Search code examples
c#performancedictionary.net-6.0hashcode

How to update the value of a key in a dictionary, but only if the key already exists, without hashing the key twice?


I have a Dictionary<string, decimal> with a fixed number of entries, and I want to update many of its values frequently, but only for keys that already exist. If a key is not already in the dictionary, I don't want to add it, because my goal is to restrict the dictionary to a fixed size. So the code below (using the set indexer) will not work for my needs:

dictionary[key] = newValue;

This code will update the value of the key if it's already there, or it will insert a new key/value pair if it's not. That's not what I want. So I wrote an extension method TryUpdate for the Dictionary<TKey, TValue> class, with the desirable behavior:

public static bool TryUpdate<TKey, TValue>(
    this Dictionary<TKey, TValue> source,
    TKey key, TValue value)
{
    ArgumentNullException.ThrowIfNull(source);
    if (!source.ContainsKey(key))
        return false;
    source[key] = value;
    return true;
}

Now my code works as expected:

dictionary.TryUpdate(key, newValue);

What I don't like to the above implementation is that if the key already exists, which is the common case in my scenario, the key will be hashed twice in each TryUpdate operation. So in order to get the guarantee that my dictionary will not grow beyond its initial size, I am going to pay a doubled price in terms of performance. The keys are long strings, so the hashing cost is significant. Is there any way to rewrite my TryUpdate method, so that it is as performant as the set indexer?


Solution

  • Since .NET 6 there is no need to use wrappers - use CollectionsMarshal.GetValueRefOrNullRef instead:

    Gets either a reference to a TValue in the Dictionary<TKey, TValue> or a reference null if it does not exist in the dictionary.

    var x = new Dictionary<string, string>{{"test", "1"}};
    ref var valueRefOrNullRef = ref CollectionsMarshal.GetValueRefOrNullRef(x, "test");
    if (!Unsafe.IsNullRef(ref valueRefOrNullRef))
    {
        valueRefOrNullRef = "2";
    }
    
    Console.WriteLine(x["test"]); // prints 2
    

    UPD

    Full implementation:

    public static bool TryUpdate<TKey, TValue>(
        this Dictionary<TKey, TValue> source,
        TKey key,
        TValue value)
        where TKey : notnull
    {
        ArgumentNullException.ThrowIfNull(source);
        ref var valueRefOrNullRef = ref CollectionsMarshal.GetValueRefOrNullRef(source, key);
        if (Unsafe.IsNullRef(ref valueRefOrNullRef))
        {
            return false;
        }
    
        valueRefOrNullRef = value;
        return true;
    }
    
    var x = new Dictionary<string, decimal> { { "test", 1 } };
    Console.WriteLine(x.TryUpdate("test", 2)); // True
    Console.WriteLine(x.TryUpdate("test1", 2)); // False
    Console.WriteLine(x["test"]); // 2
    Console.WriteLine(x.ContainsKey("test2")); // False
    

    Note that this approach has next limitation (compared to the wrapper one):

    Items should not be added or removed from the Dictionary<TKey, TValue> while the ref TValue is in use.

    UPD2

    Just for the funsies created a small benchmark (not ideal of course, it heavily misses at least on the string and dictionary size variance) but it seems that even for small dictionaries and small/medium string lengths approach with single access by key is faster:

    Method Mean Error StdDev
    BothUsingCollectionsMarshal 204.92 ns 2.008 ns 1.677 ns
    BothUsingDoubleAccess 355.74 ns 5.947 ns 6.107 ns
    OnlyPresentUsingCollectionsMarshal 116.19 ns 2.225 ns 2.081 ns
    OnlyPresentUsingDoubleAccess 271.66 ns 5.423 ns 5.326 ns
    OnlyMissingUsingCollectionsMarshal 91.20 ns 1.851 ns 1.818 ns
    OnlyMissingUsingDoubleAccess 94.78 ns 1.891 ns 2.524 ns