Search code examples
.netperformancered-black-treesortedsetsorteddictionary

Performance: SortedDictionary vs SortedSet


Should I maintain

1.) a SortedDictionary(double,struct)
2.) or just a plain Dictionary(double,struct) plus a SortedSet(double)?

I just want fast insertions. I dont care about retrievals, as I'm not gonna do much lookups. I need sorted nature because, the only lookups I do are going to be on the maximum double or a few maximum doubles.

I feel time performance wise - Both are same, the SortedSet<double> just does extra work. Can you guys confirm?

The part I'm unaware of is whether to maintain the sorting, the SortedDictionary moves around just the keys (doubles), or both the keys and values. In the latter case 2.) will outperform 1.), isn't it?

Also, its not clear how SortedDictionary is internally implemented. Sortedset is red-black tree which is a proven performer.


Solution

  • SortedDictionary<K, V> is the way to go. Not just because it is the right structure for your usage, but even performance-wise and maintenance-wise it will be just better.


    I just want fast insertions

    1. In the second case you will have to insert both into Dictionary<K, V> as well as SortedSet<K>. That's two insertions (one O(1) and other O(log n)). I would expect it to be slower than single insertion to a SortedDictionary<K, V> (O(log n)).

    2. SortedDictionary<K, V> is internally implemented as a SortedSet<KeyValuePair<K, V>> with comparison being done on the Key part of the KeyValuePair<K, V>. So if you're satisfied with the performance of SortedSet<T>, then there should be no looking back.

    the sorteddictionary moves around just the keys (doubles), or both the keys and values

    This is clearly micro-optimization. It's just a matter of moving around few extra bytes and that will hardly matter.

    its not clear how sorteddictionary is internally implemented. Sortedset is red-black tree which is a proven performer.

    SortedDictionary<K, V> is internally implemented as a SortedSet<KeyValuePair<K, V>> with comparison being done on the Key part of the KeyValuePair<K, V>. It is a red-black tree. So that is proven performer too...


    Also note that SortedDictionary<K, V> will be lighter on memory, as well as, results in faster removal as well as enumeration. The Dictionary<K, V>/SortedSet<K> hybrid approach will give you faster lookups, but it will have to do a lookup for each key in the dictionary for corresponding value part during enumeration. This is going to be slower.


    Alert: I had not read your comments when I wrote the above !!

    the struct I am using is kinda heavy ~100 bytes

    If you can change it to class, do it. Moving around 100 bytes wont be nice if your app is performance critical one.

    I made a quick and dirty Dictionary<K, V>/SortedSet<K> hybrid structure and tested it.

    • Indeed it was faster for a 100 byte struct when it came to insertion (more than twice as fast). Surely there is a penalty (who would create a 100 byte struct?).

    • When I changed it to class, they both gave the same insertion performance.

    • When I shrunk the size of the struct, even then the insertion performance was comparable.

    So my suggestion is switch to a class and use SortedDictionary<K, V>. If you are stuck with the struct, then Dictionary<K, V>/SortedSet<K> will be better. Good q, and +1.