Performance of Hashtable vs Dictionary<TKey, TValue>

Before you read: This question is about machine code and nanoseconds, because I do actually want to know it. Other criteria (like "Please use the generic way, because ...") are not what I'm asking for in this post.

Scenario

I want to add a few (let's say 4) values to a "hashmap similar data structure" and afterwards, I want to query, whether there is a value for a given key.

The 2 options we consider: Hashtable and Dictionary<T1,T2>

Goal

I want to minimize execution time (and optionally also IL size).

Experiment

The same operations for Dictionary and Hashtable:

using System;
using System.Collections;
using System.Collections.Generic;

public class Experiment {
    public void M() {
            var dictionary = new Dictionary<char, int>();
            dictionary.Add('a', 1);
            dictionary.Add('b', 5);
            dictionary.Add('c', 19);
            dictionary.Add('d', 92);
            var isInsideDictionary = dictionary.ContainsKey('e');
    }
    
    public void N(){
            var hashtable = new Hashtable();
            hashtable.Add('a', 1);
            hashtable.Add('b', 5);
            hashtable.Add('c', 19);
            hashtable.Add('d', 92);
            var isInsideTable = hashtable.ContainsKey('e');        
    }
}

Resulting Asm (x64)

Experiment..ctor()
    L0000: ret

Experiment.M()
    L0000: push rsi
    L0001: sub rsp, 0x20
    L0005: mov rcx, 0x7ff94cd46758
    L000f: call 0x00007ff9a0000b90
    L0014: mov rsi, rax
    L0017: mov rcx, rsi
    L001a: mov edx, 0x61
    L001f: mov r8d, 1
    L0025: mov r9d, 2
    L002b: mov rax, 0x7ff94cddf558
    L0035: call qword ptr [rax]
    L0037: mov rcx, rsi
    L003a: mov edx, 0x62
    L003f: mov r8d, 5
    L0045: mov r9d, 2
    L004b: mov rax, 0x7ff94cddf558
    L0055: call qword ptr [rax]
    L0057: mov rcx, rsi
    L005a: mov edx, 0x63
    L005f: mov r8d, 0x13
    L0065: mov r9d, 2
    L006b: mov rax, 0x7ff94cddf558
    L0075: call qword ptr [rax]
    L0077: mov rcx, rsi
    L007a: mov edx, 0x64
    L007f: mov r8d, 0x5c
    L0085: mov r9d, 2
    L008b: mov rax, 0x7ff94cddf558
    L0095: call qword ptr [rax]
    L0097: mov rcx, rsi
    L009a: mov edx, 0x65
    L009f: mov rax, 0x7ff94cddf528
    L00a9: call qword ptr [rax]
    L00ab: nop
    L00ac: add rsp, 0x20
    L00b0: pop rsi
    L00b1: ret

Experiment.N()
    L0000: push rdi
    L0001: push rsi
    L0002: push rbp
    L0003: push rbx
    L0004: sub rsp, 0x28
    L0008: vzeroupper
    L000b: mov rcx, 0x7ff940792378
    L0015: call 0x00007ff9a0000b90
    L001a: mov rsi, rax
    L001d: vmovss xmm2, [0x7ff94da20278]
    L0025: mov rcx, rsi
    L0028: xor edx, edx
    L002a: mov rax, 0x7ff94072d888
    L0034: call qword ptr [rax]
    L0036: mov rdi, 0x7ff9404863a0
    L0040: mov rcx, rdi
    L0043: call 0x00007ff9a0000b90
    L0048: mov rbx, rax
    L004b: mov word ptr [rbx+8], 0x61
    L0051: mov rbp, 0x7ff94045e8d0
    L005b: mov rcx, rbp
    L005e: call 0x00007ff9a0000b90
    L0063: mov dword ptr [rax+8], 1
    L006a: mov r8, rax
    L006d: mov rdx, rbx
    L0070: mov rcx, rsi
    L0073: mov r9d, 1
    L0079: mov rax, 0x7ff94072dc60
    L0083: call qword ptr [rax]
    L0085: mov rcx, rdi
    L0088: call 0x00007ff9a0000b90
    L008d: mov rbx, rax
    L0090: mov word ptr [rbx+8], 0x62
    L0096: mov rcx, rbp
    L0099: call 0x00007ff9a0000b90
    L009e: mov dword ptr [rax+8], 5
    L00a5: mov r8, rax
    L00a8: mov rdx, rbx
    L00ab: mov rcx, rsi
    L00ae: mov r9d, 1
    L00b4: mov rax, 0x7ff94072dc60
    L00be: call qword ptr [rax]
    L00c0: mov rcx, rdi
    L00c3: call 0x00007ff9a0000b90
    L00c8: mov rbx, rax
    L00cb: mov word ptr [rbx+8], 0x63
    L00d1: mov rcx, rbp
    L00d4: call 0x00007ff9a0000b90
    L00d9: mov dword ptr [rax+8], 0x13
    L00e0: mov r8, rax
    L00e3: mov rdx, rbx
    L00e6: mov rcx, rsi
    L00e9: mov r9d, 1
    L00ef: mov rax, 0x7ff94072dc60
    L00f9: call qword ptr [rax]
    L00fb: mov rcx, rdi
    L00fe: call 0x00007ff9a0000b90
    L0103: mov rbx, rax
    L0106: mov word ptr [rbx+8], 0x64
    L010c: mov rcx, rbp
    L010f: call 0x00007ff9a0000b90
    L0114: mov dword ptr [rax+8], 0x5c
    L011b: mov r8, rax
    L011e: mov rdx, rbx
    L0121: mov rcx, rsi
    L0124: mov r9d, 1
    L012a: mov rax, 0x7ff94072dc60
    L0134: call qword ptr [rax]
    L0136: mov rcx, rdi
    L0139: call 0x00007ff9a0000b90
    L013e: mov word ptr [rax+8], 0x65
    L0144: mov rdx, rax
    L0147: mov rcx, rsi
    L014a: mov rax, 0x7ff940792458
    L0154: call qword ptr [rax]
    L0156: nop
    L0157: add rsp, 0x28
    L015b: pop rbx
    L015c: pop rbp
    L015d: pop rsi
    L015e: pop rdi
    L015f: ret

Question

You can see that the number of ASM instructions is much lower when using Dictionary<T1,T2>. 37 vs 82...

Does that mean without any doubt, that you can say that doing what I have done is more performant with Dictionary<T1,T2> instead of Hashtable?

And is there any argument why you would not want to use Dictionary<T1,T2> instead of Hashtable considering performance? Does the generic approach result in more IL code, for example? Or does it behave differently when using reference types?

Solution

Microsoft claims that, for value types, the performance of Dictionary is better than that of Hashtable:

A Dictionary<TKey,TValue> of a specific type (other than Object) provides better performance than a Hashtable for value types. This is because the elements of Hashtable are of type Object; therefore, boxing and unboxing typically occur when you store or retrieve a value type.

Source: https://learn.microsoft.com/en-us/dotnet/standard/collections/hashtable-and-dictionary-collection-types

And is there any argument why you would not want to use Dictionary<T1,T2> instead of Hashtable considering performance?

In a single-threaded scenario: No. Hashtable's only purpose is backwards-compatibility.

In a multi-threaded scenario, note that Hashtable is thread-safe for multiple readers and a single writer, which Dictionary<T1, T2> is not. You'd need to lock access to the dictionary (which has its own performance cost) or use ConcurrentDictionary<T1, T2>, which provides thread-safety for multiple readers and multiple writers.