Search code examples
c#.netperformancedictionarygenerics

Performance of Hashtable vs Dictionary<TKey, TValue>


Before you read: This question is about machine code and nanoseconds, because I do actually want to know it. Other criteria (like "Please use the generic way, because ...") are not what I'm asking for in this post.

Scenario

I want to add a few (let's say 4) values to a "hashmap similar data structure" and afterwards, I want to query, whether there is a value for a given key.

The 2 options we consider: Hashtable and Dictionary<T1,T2>

Goal

I want to minimize execution time (and optionally also IL size).

Experiment

The same operations for Dictionary and Hashtable:

using System;
using System.Collections;
using System.Collections.Generic;

public class Experiment {
    public void M() {
            var dictionary = new Dictionary<char, int>();
            dictionary.Add('a', 1);
            dictionary.Add('b', 5);
            dictionary.Add('c', 19);
            dictionary.Add('d', 92);
            var isInsideDictionary = dictionary.ContainsKey('e');
    }
    
    public void N(){
            var hashtable = new Hashtable();
            hashtable.Add('a', 1);
            hashtable.Add('b', 5);
            hashtable.Add('c', 19);
            hashtable.Add('d', 92);
            var isInsideTable = hashtable.ContainsKey('e');        
    }
}

Resulting Asm (x64)

Experiment..ctor()
    L0000: ret

Experiment.M()
    L0000: push rsi
    L0001: sub rsp, 0x20
    L0005: mov rcx, 0x7ff94cd46758
    L000f: call 0x00007ff9a0000b90
    L0014: mov rsi, rax
    L0017: mov rcx, rsi
    L001a: mov edx, 0x61
    L001f: mov r8d, 1
    L0025: mov r9d, 2
    L002b: mov rax, 0x7ff94cddf558
    L0035: call qword ptr [rax]
    L0037: mov rcx, rsi
    L003a: mov edx, 0x62
    L003f: mov r8d, 5
    L0045: mov r9d, 2
    L004b: mov rax, 0x7ff94cddf558
    L0055: call qword ptr [rax]
    L0057: mov rcx, rsi
    L005a: mov edx, 0x63
    L005f: mov r8d, 0x13
    L0065: mov r9d, 2
    L006b: mov rax, 0x7ff94cddf558
    L0075: call qword ptr [rax]
    L0077: mov rcx, rsi
    L007a: mov edx, 0x64
    L007f: mov r8d, 0x5c
    L0085: mov r9d, 2
    L008b: mov rax, 0x7ff94cddf558
    L0095: call qword ptr [rax]
    L0097: mov rcx, rsi
    L009a: mov edx, 0x65
    L009f: mov rax, 0x7ff94cddf528
    L00a9: call qword ptr [rax]
    L00ab: nop
    L00ac: add rsp, 0x20
    L00b0: pop rsi
    L00b1: ret

Experiment.N()
    L0000: push rdi
    L0001: push rsi
    L0002: push rbp
    L0003: push rbx
    L0004: sub rsp, 0x28
    L0008: vzeroupper
    L000b: mov rcx, 0x7ff940792378
    L0015: call 0x00007ff9a0000b90
    L001a: mov rsi, rax
    L001d: vmovss xmm2, [0x7ff94da20278]
    L0025: mov rcx, rsi
    L0028: xor edx, edx
    L002a: mov rax, 0x7ff94072d888
    L0034: call qword ptr [rax]
    L0036: mov rdi, 0x7ff9404863a0
    L0040: mov rcx, rdi
    L0043: call 0x00007ff9a0000b90
    L0048: mov rbx, rax
    L004b: mov word ptr [rbx+8], 0x61
    L0051: mov rbp, 0x7ff94045e8d0
    L005b: mov rcx, rbp
    L005e: call 0x00007ff9a0000b90
    L0063: mov dword ptr [rax+8], 1
    L006a: mov r8, rax
    L006d: mov rdx, rbx
    L0070: mov rcx, rsi
    L0073: mov r9d, 1
    L0079: mov rax, 0x7ff94072dc60
    L0083: call qword ptr [rax]
    L0085: mov rcx, rdi
    L0088: call 0x00007ff9a0000b90
    L008d: mov rbx, rax
    L0090: mov word ptr [rbx+8], 0x62
    L0096: mov rcx, rbp
    L0099: call 0x00007ff9a0000b90
    L009e: mov dword ptr [rax+8], 5
    L00a5: mov r8, rax
    L00a8: mov rdx, rbx
    L00ab: mov rcx, rsi
    L00ae: mov r9d, 1
    L00b4: mov rax, 0x7ff94072dc60
    L00be: call qword ptr [rax]
    L00c0: mov rcx, rdi
    L00c3: call 0x00007ff9a0000b90
    L00c8: mov rbx, rax
    L00cb: mov word ptr [rbx+8], 0x63
    L00d1: mov rcx, rbp
    L00d4: call 0x00007ff9a0000b90
    L00d9: mov dword ptr [rax+8], 0x13
    L00e0: mov r8, rax
    L00e3: mov rdx, rbx
    L00e6: mov rcx, rsi
    L00e9: mov r9d, 1
    L00ef: mov rax, 0x7ff94072dc60
    L00f9: call qword ptr [rax]
    L00fb: mov rcx, rdi
    L00fe: call 0x00007ff9a0000b90
    L0103: mov rbx, rax
    L0106: mov word ptr [rbx+8], 0x64
    L010c: mov rcx, rbp
    L010f: call 0x00007ff9a0000b90
    L0114: mov dword ptr [rax+8], 0x5c
    L011b: mov r8, rax
    L011e: mov rdx, rbx
    L0121: mov rcx, rsi
    L0124: mov r9d, 1
    L012a: mov rax, 0x7ff94072dc60
    L0134: call qword ptr [rax]
    L0136: mov rcx, rdi
    L0139: call 0x00007ff9a0000b90
    L013e: mov word ptr [rax+8], 0x65
    L0144: mov rdx, rax
    L0147: mov rcx, rsi
    L014a: mov rax, 0x7ff940792458
    L0154: call qword ptr [rax]
    L0156: nop
    L0157: add rsp, 0x28
    L015b: pop rbx
    L015c: pop rbp
    L015d: pop rsi
    L015e: pop rdi
    L015f: ret

Question

You can see that the number of ASM instructions is much lower when using Dictionary<T1,T2>. 37 vs 82...

Does that mean without any doubt, that you can say that doing what I have done is more performant with Dictionary<T1,T2> instead of Hashtable?

And is there any argument why you would not want to use Dictionary<T1,T2> instead of Hashtable considering performance? Does the generic approach result in more IL code, for example? Or does it behave differently when using reference types?


Solution

  • Microsoft claims that, for value types, the performance of Dictionary is better than that of Hashtable:

    A Dictionary<TKey,TValue> of a specific type (other than Object) provides better performance than a Hashtable for value types. This is because the elements of Hashtable are of type Object; therefore, boxing and unboxing typically occur when you store or retrieve a value type.

    Source: https://learn.microsoft.com/en-us/dotnet/standard/collections/hashtable-and-dictionary-collection-types


    And is there any argument why you would not want to use Dictionary<T1,T2> instead of Hashtable considering performance?

    In a single-threaded scenario: No. Hashtable's only purpose is backwards-compatibility.

    In a multi-threaded scenario, note that Hashtable is thread-safe for multiple readers and a single writer, which Dictionary<T1, T2> is not. You'd need to lock access to the dictionary (which has its own performance cost) or use ConcurrentDictionary<T1, T2>, which provides thread-safety for multiple readers and multiple writers.