How does an implementation of the C# specification ensure that static constructors are executed in a thread-safe manner?

A C# static constructor is guaranteed to execute only once. Therefore, if I have, say, ten threads accessing a member of class A, the static constructor of A hasn't been run yet, and it takes ten seconds to run, these threads will block for ten seconds.

This seems amazing to me. How is this achieved within the JIT/CLR? Does every access to a static field enter a lock, check whether the static constructor has run, and run it if it hasn't? Wouldn't this be very slow?

To be clear, I want to know how an implementation of the specification achieves this. I know that static constructors are thread-safe; this question is not asking that. It is asking how the implementation ensures this, and whether it uses locks and checks under the hood (these are not C# locks, but locks used by the JIT/CLR/other implementation).

Solution

Let's first review the different kinds of static constructors and the rules that specify when each must be executed. There are two kinds of static constructors: Precise and BeforeFieldInit. Static constructors that are explicitly defined are Precise. If a class has initialized static fields but no explicitly defined static constructor, the managed language compiler defines one that performs the initialization of these static fields, and that constructor is BeforeFieldInit. Precise constructors must execute just before the first access to any field or the first call to any method of the type. BeforeFieldInit constructors must execute at some point before the first static field access. Now I'll discuss when and how static constructors are called in CoreCLR and CLR.

When a method is called for the first time, a temporary entry point for that method gets called, which is mainly responsible for JIT-compiling the IL code of the method. The temporary entry point (specifically, the prestub) checks the kind of the static constructor of the type that defines the method being called (irrespective of whether that method is instance or static). If it's Precise, the temporary entry point ensures that the static constructor of that type has been executed.
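To make the prestub idea concrete, here is a minimal C++ sketch of a method whose entry point starts out as a stub and is patched after the first call. It is a toy model, not CoreCLR code: TypeModel, MethodModel, prestub, and compiled_body are hypothetical names, and "JIT compilation" is reduced to appending an entry to a log.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a type: the kind of its static constructor,
// whether that constructor has run, and a log of events in order.
struct TypeModel {
    bool precise;
    bool cctor_ran;
    std::vector<std::string> log;

    explicit TypeModel(bool is_precise) : precise(is_precise), cctor_ran(false) {}

    // Run the static constructor if it has not run yet (single-threaded model).
    void ensure_cctor_ran() {
        if (!cctor_ran) {
            cctor_ran = true;
            log.push_back("cctor");
        }
    }
};

// Hypothetical model of a method whose entry point initially targets a stub.
struct MethodModel {
    TypeModel* owner;
    void (*entry)(MethodModel&);

    explicit MethodModel(TypeModel* t) : owner(t), entry(&MethodModel::prestub) {}

    void call() { entry(*this); }

    // Temporary entry point: reached only on the first call.
    static void prestub(MethodModel& m) {
        assert(m.owner != nullptr);
        if (m.owner->precise) {
            m.owner->ensure_cctor_ran();   // Precise: constructor runs before the method body
        }
        m.owner->log.push_back("jit");     // stand-in for compiling the IL
        m.entry = &MethodModel::compiled_body;  // later calls skip the stub
        m.entry(m);
    }

    // Stand-in for the generated native code.
    static void compiled_body(MethodModel& m) {
        m.owner->log.push_back("body");
    }
};
```

In this model, calling call() twice on a method of a Precise type logs cctor, jit, body, body. For a BeforeFieldInit type the stub does not trigger the constructor, so the log is jit, body, body.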

The temporary entry point then invokes the JIT compiler to emit the native code of the method (since it's being called for the first time). The JIT compiler checks whether the IL of the method includes accesses to static fields. For each accessed static field, if the static constructor of the type that defines that field is BeforeFieldInit, the compiler ensures at JIT time that the static constructor of the type has been executed. In that case, the native code of the method does not need to include any calls to the static constructor. Otherwise, if the static constructor of the type that defines that static field is Precise, the JIT compiler injects calls to the static constructor before every access to the static field in the native code of the method.
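The difference between the two cases can be illustrated with a small C++ model of the code-generation decision. This is a sketch under loud assumptions: FieldOwnerModel, jit_static_accesses, and the pseudo-instruction strings are all invented for illustration, and the real JIT emits machine code and helper calls rather than strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class CctorKind { Precise, BeforeFieldInit };

// Hypothetical model of the type that defines the accessed static field.
struct FieldOwnerModel {
    CctorKind kind;
    bool cctor_ran;
};

// Returns the "native code" for a method that reads a static field
// access_count times, as a list of pseudo-instructions.
std::vector<std::string> jit_static_accesses(FieldOwnerModel& owner, int access_count) {
    assert(access_count >= 0);
    std::vector<std::string> code;

    if (owner.kind == CctorKind::BeforeFieldInit) {
        // The constructor is run while compiling, so no check is emitted.
        owner.cctor_ran = true;
    }

    for (int i = 0; i < access_count; ++i) {
        if (owner.kind == CctorKind::Precise) {
            // Precise: a check is emitted in front of every access.
            code.push_back("call ensure_cctor_ran");
        }
        code.push_back("load static field");
    }
    return code;
}
```

For two accesses, the BeforeFieldInit case yields two pseudo-instructions and the constructor has already run by the time compilation finishes, while the Precise case yields four and leaves the constructor to be triggered at run time.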

Static constructors are executed by calling CheckRunClassInitThrowing. This function basically checks whether the type has already been initialized, and if not, it calls DoRunClassInitThrowing, which is the one that actually calls the static constructor. Before calling a static constructor, the lock associated with that constructor needs to be acquired. There is one such lock for each type. However, these locks are created lazily. That is, only when the static constructor of a type gets called is a lock created for that type. Therefore, a list of locks needs to be maintained dynamically per appdomain and this list itself needs to be protected by a lock. So calling a static constructor involves two locks: an appdomain-specific lock and a type-specific lock. The following code shows how these two locks get acquired and released (some comments are mine).

void MethodTable::DoRunClassInitThrowing()
{

    .
    .
    .

    ListLock *_pLock = pDomain->GetClassInitLock();

    // Acquire the appdomain lock.
    ListLockHolder pInitLock(_pLock);

    .
    .
    .

    // Take the lock
    {
        // Get the lock associated with the static constructor, or create a new lock if one has not been created yet.
        ListLockEntryHolder pEntry(ListLockEntry::Find(pInitLock, this, description));

        ListLockEntryLockHolder pLock(pEntry, FALSE);

        // We have a list entry, we can release the global lock now
        pInitLock.Release();

        // Acquire the constructor lock.
        // Block if another thread has the lock.
        if (pLock.DeadlockAwareAcquire())
        {
            .
            .
            .
        }

        // The constructor lock gets released at the end of this block, when the
        // destructors of the holder objects (pLock and pEntry) run.
        // The compiler itself emits the calls to these destructors
        // since the holders are automatic variables.
    }

    .
    .
    .

}
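The two-lock pattern in the excerpt above can be approximated in standard C++ as follows. This is only a sketch under simplifying assumptions: ClassInitLocks, TypeState, ensure_initialized, and race_to_initialize are hypothetical names, a std::map stands in for the per-appdomain list, entries are never removed, and there is no deadlock detection of the kind that DeadlockAwareAcquire suggests. The flag test at the top models the "already initialized" check that happens before any lock is taken, which is why accesses after initialization stay cheap.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical per-type state.
struct TypeState {
    std::atomic<bool> initialized{false};
    int cctor_runs = 0;   // written only while the type lock is held
};

class ClassInitLocks {
public:
    void ensure_initialized(TypeState& type, void (*cctor)(TypeState&)) {
        // Fast path: once the constructor has run, no lock is taken.
        if (type.initialized.load(std::memory_order_acquire)) return;

        std::shared_ptr<std::mutex> entry;
        {
            // First lock: protects the lazily populated list of per-type locks.
            std::lock_guard<std::mutex> list_guard(list_lock_);
            std::shared_ptr<std::mutex>& slot = entries_[&type];
            if (!slot) slot = std::make_shared<std::mutex>();
            entry = slot;
        }   // list lock released here, before waiting on the type lock

        // Second lock: other threads block here while the constructor runs.
        std::lock_guard<std::mutex> type_guard(*entry);
        if (!type.initialized.load(std::memory_order_relaxed)) {
            cctor(type);
            type.initialized.store(true, std::memory_order_release);
        }
    }

private:
    std::mutex list_lock_;
    std::map<TypeState*, std::shared_ptr<std::mutex>> entries_;
};

// Usage helper: thread_count threads race to initialize one type.
// Returns how many times the constructor actually ran.
int race_to_initialize(int thread_count) {
    assert(thread_count > 0);
    ClassInitLocks locks;
    TypeState type;
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&locks, &type] {
            locks.ensure_initialized(type, [](TypeState& t) { ++t.cctor_runs; });
        });
    }
    for (std::thread& th : threads) th.join();
    return type.cctor_runs;
}
```

With ten racing threads, race_to_initialize(10) reports a single constructor run: one thread wins the type lock and runs the constructor, and the rest either wait on that lock or take the lock-free fast path.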

Static constructors of appdomain-neutral types and NGEN'ed types are handled differently. In addition, the CoreCLR implementation does not strictly adhere to the semantics of Precise constructors for performance reasons. For more information, refer to the comment at the top of corinfo.h.