Tags: c#, stack, primitive

Primitive type immutability and stack?


I was reading 'CLR via C#', and while going over 'how things relate at runtime' and 'primitive, reference and value types' I got a bit confused. Suppose I have the code below, called from Main:

void DoSomething(int x)
{
    int m = x / 2;
    int n = SomeMethod1(m);
    n = (n * 2) + x;
    int k = SomeMethod2(n);
    m += (k * 3);
}

The code does nothing useful; I am just trying to understand the behavior of the local integer variables and the memory allocation.

I understand m, n and k reside on the stack, right? Now, for the last line (ignoring the variable n), the value of m has to be modified. Does everything above it on the stack have to be popped off before the value of m can be updated? Is this negligible compared to the overhead of boxing and unboxing on the heap? Does it become significant as the stack grows deeper?

P.S.: I am excluding GC from the discussion (for boxing and unboxing), which is definitely an overhead; also, this is only to try and understand the behavior, and may or may not be a practical scenario.
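
For reference, the boxing and unboxing I am comparing against looks like this (a minimal sketch; the variable names are just for illustration):

int i = 42;
object o = i;    // box: allocates an object on the heap and copies the value in
int j = (int)o;  // unbox: type-checks, then copies the value back out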


Solution

  • No, there's nothing that says m, n or k should be on the stack. Most likely, they'll be in registers, and since they aren't all live at the same time, they don't even all need storage at once - see the sketch just below. The JIT compiler is very good at runtime optimization; the stack is hardly ever used for anything but method parameters (or their references). And the GC has no power over either the registers or the stack - the stack is deallocated the same way as in native code. Only the heap is the domain of the GC.
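
    As an illustrative sketch (not actual JIT output), here is the method from the question annotated with where each local stops being needed - once a value is dead, the register allocator can reuse its storage:

    void DoSomething(int x)
    {
        int m = x / 2;          // m stays live until the last line
        int n = SomeMethod1(m); // n is live until the SomeMethod2 call
        n = (n * 2) + x;        // x is dead after this line
        int k = SomeMethod2(n); // n is dead after this call; k can reuse its register
        m += (k * 3);           // k is dead after this line
    }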

    This is a common misunderstanding: IL (the intermediate language C# compiles into) really does express everything in terms of a stack - the evaluation stack. However, that is not what executes on your machine; the IL code is compiled yet again by the JIT compiler. Of course, the JIT compiler is free to do everything on a real stack as well, but that would be silly. Simple example time:

    x + y
    

    Would compile to this IL (in pseudo-code):

    push x
    push y
    add
    

    So even a simple integer addition needs two pushes to the evaluation stack, plus an add that pops both operands and pushes the result. Executed naively, that's not terribly expensive, but if you're trying to do any serious calculation, you're in trouble.

    However, the JIT compiler will produce something more like this:

    add eax, ebx
    

    It's very smart about this, and in fact it often uses registers even for method parameters - when it's safe. On x64, the calling convention already passes the first few integer arguments in registers to begin with.

    However, note that all of this is just an implementation detail - in this case, a performance optimization. An integer might just as easily live on the heap, and in fact it does if it's e.g. boxed, or if it's a field of another object that lives on the heap.
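
    A quick sketch of that second case (Holder is just an illustrative name):

    class Holder
    {
        public int Number;   // this int is part of a heap object, so it lives on the heap
    }

    var h = new Holder();    // the Holder - and the int inside it - is allocated on the heap
    h.Number = 42;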

    So, obviously, the fastest things are those that are supported inside the CPU itself - as in the example above, adding two registers together. Very fast.

    Using the stack is usually still very fast. It's not that allocating on the heap is expensive - in fact, a heap allocation in .NET is little more than a pointer bump, about the same cost as a stack allocation. It's the deallocation that hurts: the heap needs to be garbage collected and compacted, while the stack only needs a single pointer change.
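
    You can see both kinds of allocation side by side in C# (Span<T> and stackalloc require C# 7.2 or later):

    Span<int> onStack = stackalloc int[16]; // stack allocation: freed by a pointer change on return
    int[] onHeap = new int[16];             // heap allocation: a cheap pointer bump now, but the
                                            // GC has to trace, collect and compact it later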

    Oh, and you misunderstand the way the stack can be accessed. True, there are the basic push and pop instructions, but those are not the only way. You can address the stack's memory directly, the same way you address any other piece of memory - nothing has to be popped off to update a value in the middle. In fact, that's the whole point of exposing the stack pointer (ESP, or RSP on x64): locals are simply read and written at fixed offsets from it.
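
    You can even demonstrate that from C# with unsafe code (compile with /unsafe; a minimal sketch):

    static unsafe void Demo()
    {
        int a = 1, b = 2, m = 10;    // three locals, conceptually "stacked"
        int* p = &m;                 // take the address of m's slot directly
        *p += 5;                     // update it in place - nothing is popped off
        System.Console.WriteLine(m); // prints 15
    }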