c# compiler-construction static-variables

What is the implementation difference between a static variable and a static field?


This question is asked from a compiler-implementation perspective.

I wondered about static variables in C# and found an explanation of why they are not implemented (here: http://blogs.msdn.com/b/csharpfaq/archive/2004/05/11/why-doesn-t-c-support-static-method-variables.aspx ).

The quote "it is possible to get nearly the same effect by having a class-level static" made me curious: what is the difference? Suppose C# had static variable syntax. The implementation could be "silently turn this into a static field and leave conditional initialization in place (if necessary)". Done.

The only problem I can spot is a value type with a given initializer. Is there anything else that fits into "nearly"?

To rephrase the question: how would one implement static variables in a C# compiler using only existing features (so a static variable has to be expressed internally in terms of what the language already has)?
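
A minimal sketch of the lowering proposed above, written in standard C++ only so that it can be compiled and run. The names Sequence, Next, hidden_counter and hidden_counter_initialized are hypothetical, and the sketch deliberately ignores thread safety:

```cpp
#include <cassert>

// Hypothetical result of lowering
//     int Next(int seed) { static int counter = seed; return ++counter; }
// into "a hidden static field plus conditional initialization".
class Sequence {
    static int  hidden_counter;              // storage for the would-be static variable
    static bool hidden_counter_initialized;  // separate flag: an int has no "unset" value

public:
    int Next(int seed) {
        if (!hidden_counter_initialized) {   // not thread-safe: two threads can both get here
            hidden_counter = seed;
            hidden_counter_initialized = true;
        }
        return ++hidden_counter;
    }
};

int  Sequence::hidden_counter = 0;
bool Sequence::hidden_counter_initialized = false;

void sequence_usage() {
    Sequence s, t;
    assert(s.Next(10) == 11);   // first call initializes from the argument
    assert(t.Next(500) == 12);  // later seeds are ignored; state is shared across instances
}
```

The extra flag is where the value-type problem shows up: a reference could use null as the "not yet initialized" marker, but an int needs a second field.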


Solution

  • It's actually very easy to check what the compiler would have to do to implement static variables in C#.

    C# is designed to be compiled to CIL (Common Intermediate Language). C++, which supports static local variables, can also be compiled to CIL (as C++/CLI).

    Let's see what happens when we do it. First, let's consider the following simple class:

    public ref class Class1
    {
    private:
        static int i = 0;
    
    public:
        int M() {
            static int i = 0;
            i++;
            return i;
        }
    
        int M2() {
            i++;
            return i;
        }
    };
    


    Two methods with the same behavior: i is initialized to 0, then incremented and returned each time the method is called. Let's compare the IL.

    .method public hidebysig instance int32  M() cil managed
    {
      // Code size       20 (0x14)
      .maxstack  2
      .locals ([0] int32 V_0)
      IL_0000:  ldsfld     int32 '?i@?1??M@Class1@CppClassLibrary@@Q$AAMHXZ@4HA'
      IL_0005:  ldc.i4.1
      IL_0006:  add
      IL_0007:  stsfld     int32 '?i@?1??M@Class1@CppClassLibrary@@Q$AAMHXZ@4HA'
      IL_000c:  ldsfld     int32 '?i@?1??M@Class1@CppClassLibrary@@Q$AAMHXZ@4HA'
      IL_0011:  stloc.0
      IL_0012:  ldloc.0
      IL_0013:  ret
    } // end of method Class1::M
    
    .method public hidebysig instance int32  M2() cil managed
    {
      // Code size       20 (0x14)
      .maxstack  2
      .locals ([0] int32 V_0)
      IL_0000:  ldsfld     int32 CppClassLibrary.Class1::i
      IL_0005:  ldc.i4.1
      IL_0006:  add
      IL_0007:  stsfld     int32 CppClassLibrary.Class1::i
      IL_000c:  ldsfld     int32 CppClassLibrary.Class1::i
      IL_0011:  stloc.0
      IL_0012:  ldloc.0
      IL_0013:  ret
    } // end of method Class1::M2
    

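    They are the same. The only difference in the IL is the field name: it uses characters that are legal in CIL but illegal in C++, so the same name cannot be used in C++ code. The C# compiler uses this trick very often for auto-generated fields. The other difference is that the static variable cannot be accessed via reflection on the class. I don't know exactly how that's done, although the IL references the field without a class qualifier, which suggests it is emitted as a module-level field rather than as a member of Class1.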
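
The equivalence seen in the IL can be reproduced in plain standard C++. This is only a sketch: Counter, M_lowered and hidden_i_for_M are hypothetical names, and the "hidden" field here is an ordinary private static member rather than a compiler-mangled one.

```cpp
#include <cassert>

class Counter {
    // Hypothetical field a compiler could generate for the static local in M().
    static int hidden_i_for_M;

public:
    // Version with a function-local static, as in the original M().
    int M() {
        static int i = 0;
        i++;
        return i;
    }

    // Hand-lowered version: same logic, but the state lives in a static field.
    int M_lowered() {
        hidden_i_for_M++;
        return hidden_i_for_M;
    }
};

int Counter::hidden_i_for_M = 0;

void counter_usage() {
    Counter a, b;
    assert(a.M() == 1);
    assert(b.M() == 2);          // the static local is shared by all instances
    assert(a.M_lowered() == 1);
    assert(b.M_lowered() == 2);  // and so is the static field
}
```

With a compile-time constant initializer, no guard or flag is needed in either version.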

    Let's move to a more interesting example.

    int M3(int a) {
        static int i = a;
        i++;
        return i;
    }
    

    Now the fun begins. The static variable can no longer be initialized at compile time; it has to be initialized at run time. The compiler also has to make sure it is initialized only once, so the initialization has to be thread-safe.

    The resulting CIL is:

    .method public hidebysig instance int32  M3(int32 a) cil managed
    {
      // Code size       73 (0x49)
      .maxstack  2
      .locals ([0] int32 V_0)
      IL_0000:  ldsflda    int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'                                              
      IL_0005:  call       void _Init_thread_header_m(int32 modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)*)
      IL_000a:  ldsfld     int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
      IL_000f:  ldc.i4.m1
      IL_0010:  bne.un.s   IL_0035
      .try
      {
        IL_0012:  ldarg.1
        IL_0013:  stsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
        IL_0018:  leave.s    IL_002b
      }  // end .try
      fault
      {
        IL_001a:  ldftn      void _Init_thread_abort_m(int32 modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)*)
        IL_0020:  ldsflda    int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
        IL_0025:  call       void ___CxxCallUnwindDtor(method void *(void*),
                                                       void*)
        IL_002a:  endfinally
      }  // end handler
      IL_002b:  ldsflda    int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
      IL_0030:  call       void _Init_thread_footer_m(int32 modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)*)
      IL_0035:  ldsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
      IL_003a:  ldc.i4.1
      IL_003b:  add
      IL_003c:  stsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
      IL_0041:  ldsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
      IL_0046:  stloc.0
      IL_0047:  ldloc.0
      IL_0048:  ret
    } // end of method Class1::M3
    

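    It looks much more complicated: there is a second static field and something that looks like a critical section. I can't find any documentation for the _Init_thread_* methods, but they appear to be the Visual C++ runtime helpers behind thread-safe initialization of local statics.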
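
What the extra guard field and the _Init_thread_* calls achieve can be approximated by hand. The following is a sketch in standard C++, not what the compiler actually emits: the names M3Lowered, m3_value, m3_ready and m3_lock are hypothetical, and a mutex with a double-checked flag stands in for the runtime helpers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for the two hidden fields seen in the IL:
// one holds the value, the other records whether initialization finished.
static int               m3_value;
static std::atomic<bool> m3_ready{false};
static std::mutex        m3_lock;  // assumption: plays the role of the runtime's lock

int M3Lowered(int a) {
    // Fast path: once the guard is set, no lock is taken.
    if (!m3_ready.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> hold(m3_lock);
        // Re-check under the lock so only the first caller runs the initializer.
        if (!m3_ready.load(std::memory_order_relaxed)) {
            m3_value = a;  // if this threw, the guard would stay unset and a later call would retry
            m3_ready.store(true, std::memory_order_release);
        }
    }
    // As in the original method, the increment itself is not synchronized.
    m3_value++;
    return m3_value;
}

void m3_usage() {
    assert(M3Lowered(10) == 11);   // first call initializes from its argument
    assert(M3Lowered(100) == 12);  // later arguments are ignored
}
```

Every call pays for at least one atomic load of the guard, which is the kind of overhead a field initialized at compile time avoids.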

    It doesn't look so trivial anymore. Performance suffers too. IMHO, it was a good decision not to implement static variables in C#.

    To summarize, to support static variables the C# compiler would have to:

    1. Create a private static field for the variable, making sure the name is unique and cannot be used directly in C# code.
    2. Make this field invisible via reflection.
    3. If the initialization cannot be done at compile-time, make it thread-safe.

    It doesn't seem like much, but when you combine several features like this one, the complexity grows quickly.

    And the only thing you get in return is an easy, compiler-provided, thread-safe initialization.

    It's not a good idea to add a feature to a language only because other languages support it. Add the feature when it's really needed. The C# design team already made this mistake with array covariance.