Search code examples
c#memory-managementsorteddictionary

Why does sortedDictionary need so much overhead?


long b = GC.GetTotalMemory(true);
SortedDictionary<int, int> sd = new SortedDictionary<int, int>();
for (int i = 0; i < 10000; i++)
{
    sd.Add(i, i+1);
}
long a = GC.GetTotalMemory(true);
Console.WriteLine((a - b));            
int reference = sd[10];    

output (32 bit):

280108

output (64 bit):

480248

Storing the ints alone (in an array) would require about 80000.


Solution

  • Internally SortedDictionary<TKey, TValue> uses TreeSet<KeyValuePair<TKey, TValue>>. The tree uses Node<T> and obviously it uses references between nodes, so in addition to each key and value you will have references to left and right nodes as well as some additional properties. Node<T> is a class so each instance has the traditional overhead of a reference type. Furthermore, each node also stores a boolean called IsRed.

    According to WinDbg/SOS the size of a single node on 64 bits in 48 bytes so 10000 of them will take up at least 480000 bytes.

    0:000> !do 0000000002a51d90 
    Name:            System.Collections.Generic.SortedSet`1+Node[[System.Collections.Generic.KeyValuePair`2[[System.Int32, mscorlib],[System.Int32, mscorlib]], mscorlib]]
    MethodTable: 000007ff000282c8
    EEClass:     000007ff00133b18
    Size:        48(0x30) bytes     <== Size of a single node
    File:            C:\windows\Microsoft.Net\assembly\GAC_MSIL\System\v4.0_4.0.0.0__b77a5c561934e089\System.dll
    Fields:
                  MT    Field   Offset                 Type VT     Attr            Value     Name
    000007feeddfd6e8  4000624       18       System.Boolean  1 instance                0     IsRed
    000007feee7fd3b8  4000625       20 ...Int32, mscorlib]]  1 instance 0000000002a51db0 Item
    000007ff000282c8  4000626        8 ...lib]], mscorlib]]  0 instance 0000000002a39d90 Left
    000007ff000282c8  4000627       10 ...lib]], mscorlib]]  0 instance 0000000002a69d90 Right