c# compiler-construction static-variables

What is the implementation difference between a static variable and a static field?

This question is asked from a compiler-implementation perspective.

I was wondering about static variables in C# and found an explanation of why they are not implemented (here: http://blogs.msdn.com/b/csharpfaq/archive/2004/05/11/why-doesn-t-c-support-static-method-variables.aspx ).

The quote "it is possible to get nearly the same effect by having a class-level static" made me curious: what is the difference? Suppose C# had static variable syntax. The implementation could be "silently turn this into a static field and keep a conditional initialization (if necessary)". Done.

The only problem I can spot is a value type with a given initializer. Is there anything else hiding behind that "nearly"?

Let me rephrase the question: how would one implement static variables in a C# compiler using only existing features (so a static variable has to be expressed internally in terms of what the language already has)?
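To make the proposed lowering concrete, here is a minimal sketch in C++ (used because C# has no static locals to compare against). It shows a static local with a run-time initializer next to a hand-written version that uses a hidden variable plus an "initialized" flag. The names `counter_i` and `counter_i_initialized` are hypothetical, and the flag-based version is deliberately naive: it ignores thread safety.

```cpp
// What the language feature looks like: i is initialized on the first call only.
int counter(int start) {
    static int i = start;
    i++;
    return i;
}

namespace lowered {

// Hypothetical hidden storage a compiler could generate for the static local.
// An int has no "not yet set" value, so a separate flag is needed.
bool counter_i_initialized = false;
int counter_i;

// Hand-lowered equivalent: conditional initialization, then the original body.
// Not thread-safe: two threads could both see the flag as false.
int counter(int start) {
    if (!counter_i_initialized) {
        counter_i = start;
        counter_i_initialized = true;
    }
    counter_i++;
    return counter_i;
}

}  // namespace lowered
```

In both versions the argument of the second call is ignored, because the variable has already been initialized by the first call.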

Solution

It's actually very easy to check what the compiler would have to do to implement static variables in C#.

C# is designed to be compiled to CIL (Common Intermediate Language). C++, which supports static variables, can also be compiled to CIL (as C++/CLI).

Let's see what happens when we do it. First, let's consider the following simple class:

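namespace CppClassLibrary {

public ref class Class1
{
private:
    static int i = 0;          // ordinary static field

public:
    int M() {
        static int i = 0;      // static local variable; hides the field inside M
        i++;
        return i;
    }

    int M2() {
        i++;                   // uses the static field Class1::i
        return i;
    }
};

}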

Two methods with the same behavior: each has its own i, initialized to 0, which is incremented and returned every time the method is called. Let's compare the IL.

.method public hidebysig instance int32  M() cil managed
{
  // Code size       20 (0x14)
  .maxstack  2
  .locals ([0] int32 V_0)
  IL_0000:  ldsfld     int32 '?i@?1??M@Class1@CppClassLibrary@@Q$AAMHXZ@4HA'
  IL_0005:  ldc.i4.1
  IL_0006:  add
  IL_0007:  stsfld     int32 '?i@?1??M@Class1@CppClassLibrary@@Q$AAMHXZ@4HA'
  IL_000c:  ldsfld     int32 '?i@?1??M@Class1@CppClassLibrary@@Q$AAMHXZ@4HA'
  IL_0011:  stloc.0
  IL_0012:  ldloc.0
  IL_0013:  ret
} // end of method Class1::M

.method public hidebysig instance int32  M2() cil managed
{
  // Code size       20 (0x14)
  .maxstack  2
  .locals ([0] int32 V_0)
  IL_0000:  ldsfld     int32 CppClassLibrary.Class1::i
  IL_0005:  ldc.i4.1
  IL_0006:  add
  IL_0007:  stsfld     int32 CppClassLibrary.Class1::i
  IL_000c:  ldsfld     int32 CppClassLibrary.Class1::i
  IL_0011:  stloc.0
  IL_0012:  ldloc.0
  IL_0013:  ret
} // end of method Class1::M2

They are the same; the only difference in the IL is the field name. The name generated for the static variable uses characters that are legal in CIL but illegal in C++, so the same name cannot be used in C++ code. The C# compiler uses this trick very often for auto-generated fields. One further difference is that the static variable cannot be accessed via reflection; I don't know how that is done.
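The equivalence can be sketched in plain C++ as well. This is only an illustration of the idea, not what the C++/CLI compiler literally emits: the static local with a constant initializer behaves like a hidden variable that no other code can name. The identifier `hidden_i_for_next` is hypothetical; a real compiler would generate a name that cannot be written in source code.

```cpp
// The language feature: a static local with a compile-time constant initializer.
int next_with_static_local() {
    static int i = 0;   // constant initializer, so no run-time guard is needed
    i++;
    return i;
}

namespace {
// Hypothetical stand-in for the compiler-generated field.
// The unnamed namespace keeps it invisible to other translation units.
int hidden_i_for_next = 0;
}  // namespace

// Hand-lowered equivalent: same body, but operating on the hidden variable.
int next_with_hidden_field() {
    hidden_i_for_next++;
    return hidden_i_for_next;
}
```

Each function keeps its own counter, so both return 1, 2, 3, ... independently of each other.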

Let's move to a more interesting example.

int M3(int a) {
    static int i = a;
    i++;
    return i;
}

Now the fun begins. The static variable can no longer be initialized at compile time; it has to be initialized at run time. The compiler also has to make sure it is initialized only once, so the initialization has to be thread-safe.

The resulting CIL is:

.method public hidebysig instance int32  M3(int32 a) cil managed
{
  // Code size       73 (0x49)
  .maxstack  2
  .locals ([0] int32 V_0)
  IL_0000:  ldsflda    int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'                                              
  IL_0005:  call       void _Init_thread_header_m(int32 modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)*)
  IL_000a:  ldsfld     int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
  IL_000f:  ldc.i4.m1
  IL_0010:  bne.un.s   IL_0035
  .try
  {
    IL_0012:  ldarg.1
    IL_0013:  stsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
    IL_0018:  leave.s    IL_002b
  }  // end .try
  fault
  {
    IL_001a:  ldftn      void _Init_thread_abort_m(int32 modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)*)
    IL_0020:  ldsflda    int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
    IL_0025:  call       void ___CxxCallUnwindDtor(method void *(void*),
                                                   void*)
    IL_002a:  endfinally
  }  // end handler
  IL_002b:  ldsflda    int32 '?$TSS0@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
  IL_0030:  call       void _Init_thread_footer_m(int32 modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)*)
  IL_0035:  ldsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
  IL_003a:  ldc.i4.1
  IL_003b:  add
  IL_003c:  stsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
  IL_0041:  ldsfld     int32 '?i@?1??M3@Class1@CppClassLibrary@@Q$AAMHH@Z@4HA'
  IL_0046:  stloc.0
  IL_0047:  ldloc.0
  IL_0048:  ret
} // end of method Class1::M3

This looks much more complicated: there is a second static field (the $TSS0 one, which is compared against -1) and something that looks like a critical section (although I can't find any information about the _Init_thread_* methods).
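The shape of that IL (header call, compare the guard with -1, initialize inside a protected region, abort on failure, footer call) can be approximated by hand. The following is a rough sketch under stated assumptions, not the actual runtime implementation: the helpers `init_header`, `init_footer` and `init_abort` are hypothetical stand-ins for the `_Init_thread_*` functions, and apart from -1 (which is visible in the IL) the guard values are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

namespace sketch {

constexpr int kUninitialized = 0;      // assumed value
constexpr int kBeingInitialized = -1;  // the IL compares the guard with -1
constexpr int kInitialized = 1;        // assumed value

std::mutex g_mutex;
std::condition_variable g_cv;

// Wait while another thread is initializing; claim the job if nobody has done it.
void init_header(std::atomic<int>& guard) {
    std::unique_lock<std::mutex> lock(g_mutex);
    g_cv.wait(lock, [&guard] { return guard.load() != kBeingInitialized; });
    if (guard.load() == kUninitialized) guard.store(kBeingInitialized);
}

// Initialization succeeded: publish that fact and wake up waiting threads.
void init_footer(std::atomic<int>& guard) {
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        assert(guard.load() == kBeingInitialized);
        guard.store(kInitialized);
    }
    g_cv.notify_all();
}

// Initialization threw: reset the guard so that a later call can retry.
void init_abort(std::atomic<int>& guard) {
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        guard.store(kUninitialized);
    }
    g_cv.notify_all();
}

// Hypothetical hidden fields for "static int i = a;" inside M3.
std::atomic<int> m3_guard{kUninitialized};
int m3_i;

int M3(int a) {
    init_header(m3_guard);
    if (m3_guard.load() == kBeingInitialized) {  // this thread must initialize
        try {
            m3_i = a;
        } catch (...) {
            init_abort(m3_guard);
            throw;
        }
        init_footer(m3_guard);
    }
    m3_i++;  // as in the original, only the initialization is synchronized
    return m3_i;
}

}  // namespace sketch
```

Note that every call goes through the lock in `init_header`, even after initialization; a real implementation would presumably add a cheap fast path for the already-initialized case.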

It doesn't look so trivial anymore. Performance suffers too. IMHO, it was a good decision not to implement static variables in C#.

To summarize, to support static variables the C# compiler would have to:

Create a private static field for the variable, making sure the name is unique and cannot be used directly in C# code.
Make this field invisible via reflection.
If the initialization cannot be done at compile-time, make it thread-safe.

It doesn't seem like much, but if you combine several features like this one, the complexity grows very quickly.

And the only thing you get in return is an easy, compiler-provided, thread-safe initialization.

It's not a good idea to add a feature to a language only because other languages support it. Add the feature when it's really needed. The C# design team already made this mistake with array covariance.