c# .net static clr

Why does the CLR keep checking static members for type constructor invocation even after the constructor has been invoked?


I understand that when a type declares an explicit static constructor, the just-in-time (JIT) compiler adds a check to each static method and instance constructor of the type to make sure that the static constructor was previously called.

I imagine this behaviour as the following code (please correct me if this conclusion is wrong):

class  ExplicitConstructor
    {
        private static string myVar;

        // Force “precise” initialization 
        static ExplicitConstructor() { myVar = "hello, world";} 
    
        
        /* CLR: if the type constructor has not been invoked yet,
                add a call to the type constructor here */
        public static string MyProperty
        {
            get { return myVar; }
        }
        
        /* CLR: if the type constructor has not been invoked yet,
                add a call to the type constructor here */
        public ExplicitConstructor()
        {
            Console.WriteLine("In instance ctor");
        }

    }

    class ImplicitConstructor 
    { 
        private static string myVar = "hello, world";
        
        public static string MyProperty
        {
            /* CLR: Invoke the type constructor only here */
            get { return myVar; }
        }
        
        public ImplicitConstructor()
        {
            Console.WriteLine("In instance ctor");
        }
    }
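
The "precise" timing mentioned in the comment above can be observed directly. The following is only a minimal sketch: `InitTrace` and `PreciseInit` are hypothetical names introduced for illustration and are not part of the code above. It records when an explicit static constructor runs relative to the first access of a static member.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that records the order of events.
static class InitTrace
{
    public static readonly List<string> Events = new List<string>();
}

// Hypothetical type with an explicit static constructor ("precise" initialization).
class PreciseInit
{
    public static readonly string Value;

    static PreciseInit()
    {
        InitTrace.Events.Add("static ctor");
        Value = "hello, world";
    }
}

static class Program
{
    static void Main()
    {
        InitTrace.Events.Add("before first access");
        _ = PreciseInit.Value;   // first access triggers the static constructor
        _ = PreciseInit.Value;   // later accesses do not run it again
        InitTrace.Events.Add("after accesses");

        Console.WriteLine(string.Join(" -> ", InitTrace.Events));
        // prints "before first access -> static ctor -> after accesses"
    }
}
```

A type without an explicit static constructor is marked `beforefieldinit`, so the runtime may initialize it at some earlier point; the same experiment on such a type has no guaranteed ordering.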

According to performance rules, this behaviour affects performance because of the checks that the runtime performs in order to run the type initializer at a precise time.

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Order;

[MemoryDiagnoser]
[Orderer(SummaryOrderPolicy.FastestToSlowest)]
[RankColumn]
public class BenchmarkExample
{
    public const int iteration = Int32.MaxValue - 1;

    [Benchmark]
    public void BenchExplicitConstructor()
    {
        for (int i = 0; i < iteration; i++)
        {
            var temp = ExplicitConstructor.MyProperty;
        }
    }

    [Benchmark]
    public void BenchImplicitConstructor()
    {
        for (int i = 0; i < iteration; i++)
        {
            var temp = ImplicitConstructor.MyProperty;
        }
    }

}
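
| Method                   |       Mean |     Error |   StdDev | Rank | Allocated |
|------------------------- |-----------:|----------:|---------:|-----:|----------:|
| BenchImplicitConstructor |   982.6 ms |  56.64 ms | 163.4 ms |    1 |         - |
| BenchExplicitConstructor | 7,361.4 ms | 318.19 ms | 933.2 ms |    2 |         - |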

Why doesn't the CLR check just once whether the type constructor has been invoked, instead of adding a check to each static method and instance constructor of the type?


Solution

  • The cost of checking that the static constructor has been called in ExplicitConstructor is exaggerated here, because the JIT failed to optimise the check out of the loop in the benchmark method. The JIT-compiled assembly produced by the BenchmarkDotNet DisassemblyDiagnoser shows this:

    ; BenchExplicitConstructor()
           push      rsi
           sub       rsp,20
           xor       esi,esi
    
    M00_L00: ; Hot loop
           mov       rcx,7FFE299E97C0
           mov       edx,6
           call      CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE ; static check
           inc       esi
           cmp       esi,7FFFFFFE
           jl        short M00_L00
    
           add       rsp,20
           pop       rsi
           ret
    

    Helping it out by ensuring ExplicitConstructor has been checked (and initialized in this case) before hitting the hot loop results in parity.

    [Benchmark]
    public void BenchExplicitConstructor()
    {
        _ = ExplicitConstructor.MyProperty; // Explicitly check the class
    
        for (int i = 0; i < iteration; i++)
        {
            var temp = ExplicitConstructor.MyProperty;
        }
    }
    
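    | Method                   |     Mean |   Error |  StdDev | Rank | Code Size | Allocated |
    |------------------------- |---------:|--------:|--------:|-----:|----------:|----------:|
    | BenchImplicitConstructor | 672.8 ms | 0.23 ms | 0.18 ms |    1 |      43 B |   3,992 B |
    | BenchExplicitConstructor | 673.6 ms | 1.19 ms | 0.93 ms |    1 |      40 B |     384 B |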

    Why didn't the JIT do this by itself? In the context of one small piece of code, the appropriate heuristics weren't available for it to make the optimisation. It's highly unlikely this would be missed in the context of a normal program.

    Update:

    This is a known JIT issue covered by #1327, where the discussion notes that methods containing loops skip quick JIT by default (see tiered compilation). This means that an initial compilation that didn't hoist the static check (lift it out of the loop body) is locked in for the lifetime of the program.

    While this performance issue is currently unresolved, a related feature request mentions a cleaner workaround: using a Fody attribute rather than a non-obvious call to hoist the check manually.
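
If the "non-obvious call" is the main objection, another option that might be worth trying is to trigger the static constructor explicitly with `RuntimeHelpers.RunClassConstructor` before the hot method is first called. The sketch below is an assumption-laden illustration, not the approach from the linked discussion: `InitLog`, `Program`, and `HotLoop` are hypothetical names, and whether the JIT then omits the check inside the loop depends on the runtime version, so it should be confirmed with the DisassemblyDiagnoser.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical counter kept outside the type, so reading it does not itself
// trigger ExplicitConstructor's static constructor.
static class InitLog
{
    public static int StaticCtorRuns;
}

class ExplicitConstructor
{
    private static string myVar;

    static ExplicitConstructor()
    {
        InitLog.StaticCtorRuns++;
        myVar = "hello, world";
    }

    public static string MyProperty
    {
        get { return myVar; }
    }
}

public static class Program
{
    // Hypothetical hot method; NoInlining keeps it a separately compiled method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int HotLoop(int iterations)
    {
        int total = 0;
        for (int i = 0; i < iterations; i++)
        {
            total += ExplicitConstructor.MyProperty.Length;
        }
        return total;
    }

    public static void Main()
    {
        // Run the static constructor up front, before HotLoop is ever called.
        // Assumption: a type that is already initialized when HotLoop gets
        // compiled needs no initialization check in the generated code.
        RuntimeHelpers.RunClassConstructor(typeof(ExplicitConstructor).TypeHandle);

        Console.WriteLine(InitLog.StaticCtorRuns); // prints 1
        Console.WriteLine(HotLoop(1000));          // prints 12000
        Console.WriteLine(InitLog.StaticCtorRuns); // prints 1 (constructor ran only once)
    }
}
```

The call states its intent in its name, which a discarded property read does not. It only guarantees that the constructor has run; any effect on the generated loop code is something to measure rather than assume.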