c# c++-cli clr boxing

Why doesn't C# modify fields directly in the managed heap like C++/CLI?


In the book CLR via C# by Jeffrey Richter, in the discussion of boxing and unboxing/copying operations, Mr. Richter shows this demo:

public static void Main() {
    Point p; // Point is a value type defined before
    p.x = p.y = 1;
    Object o = p; // Boxes p; o refers to the boxed instance

    // Change Point's x field to 2
    p = (Point) o; // Unboxes o AND copies fields from boxed
    // instance to stack variable
    p.x = 2; // Changes the state of the stack variable
    o = p; // Boxes p; o refers to a new boxed instance
}

In order to change the value of a boxed instance, we need to unbox it and then box it again, which does not perform well.

In case I have misunderstood the book, here is exactly what the author says:

However, in some languages like C++/CLI, they allow you to unbox a boxed value type without copying the fields. Unboxing returns the address of the unboxed portion of a boxed object (ignoring the object’s type object pointer and sync block index overhead). You can now use this pointer to manipulate the unboxed instance’s fields (which happen to be in a boxed object on the heap). For example, the previous code would be much more efficient if written in C++/CLI, because you could change the value of Point’s x field within the already boxed Point instance. This would avoid both allocating a new object on the heap and copying all of the fields twice!

Here is my question: why doesn't C# do what C++/CLI does? Are there other important issues that led the C# team to choose this "redundant" design, even though it makes performance worse than we would expect? What's behind this "unnatural" behavior?


Solution

  • C# and C++ have somewhat different philosophies. Unlike C++, which is a systems programming language, C# wants to save the programmer from dealing with memory allocation and pointers, so it tries to hide some of the details involved in memory management. In C#, value types are roughly equivalent to primitive types and structs in C++: they are just a sequence of bytes somewhere in memory. In C++, you can easily declare a pointer to an int or struct, wherever it may reside, but safe C# code won't allow that. This indirection is hidden inside the boxing/unboxing mechanism.

    "Boxing" roughly means that the value type is created on the heap, and an object reference to this memory is silently returned. Unboxing simply reverses this indirection. Boxing/unboxing occurs automatically whenever you use a value type in a context where a reference type is expected - for instance, when casting an integer to an Object:

    Int32 i4Value = 12345;
    Object pValue = (Object) i4Value;
    

    Casting an Object reference back to an integer automatically unboxes the value, i.e. copies it from the heap to a normal value.
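
    To make the copy semantics concrete, here is a minimal sketch. It assumes a Point struct with public x and y fields like the one in the question; the class and method names are made up for illustration. It shows that changing the unboxed copy leaves the boxed instance untouched:

    ```csharp
    using System;

    // Assumed definition, mirroring the Point used in the question.
    struct Point { public int x, y; }

    static class UnboxCopyDemo
    {
        // Returns the x field of the boxed instance after editing an unboxed copy.
        public static int BoxedXAfterEdit()
        {
            Point p;
            p.x = p.y = 1;
            object o = p;            // boxing: copies p into a new heap object

            Point copy = (Point)o;   // unboxing: copies the fields back out
            copy.x = 2;              // changes only the local copy

            return ((Point)o).x;     // the boxed instance still holds 1
        }

        static void Main()
        {
            Console.WriteLine(BoxedXAfterEdit()); // prints 1
        }
    }
    ```

    To get the change into o, you would have to box the copy again (o = copy;), which is exactly the second allocation the question complains about.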

    The C++/CLI team did quite some wizardry to allow C++ programmers to use low-level CLR features. This is one example. They exploit the C++ syntax to let you do some things a C# programmer isn't even aware of, because they happen somewhere behind the scenes.

    Frankly, I've never needed to use this special boxing/unboxing trick of C++/CLI, although I write C++/CLI code all day long. However, there are certainly some situations where there's no other way, for instance when interfacing with some weird third-party C# component that cannot be changed because no source code is available.
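
    For comparison, the in-place modification described in the book quote can be approximated from C# as well. The sketch below assumes a runtime that ships System.Runtime.CompilerServices.Unsafe (part of the framework in recent .NET versions, a separate package on older ones) and uses its Unsafe.Unbox<T> method; the Point struct and the class name are illustrative. Treat it as a demonstration of what the CLR permits, not as a recommended pattern:

    ```csharp
    using System;
    using System.Runtime.CompilerServices;

    // Assumed definition, mirroring the Point used in the question.
    struct Point { public int x, y; }

    static class InPlaceUnboxDemo
    {
        // Returns the x field of the boxed instance after an in-place write.
        public static int ChangeBoxedX()
        {
            Point p;
            p.x = p.y = 1;
            object o = p;            // the only box allocation

            // Unsafe.Unbox<T> returns a reference to the value stored inside
            // the box instead of copying it out.
            ref Point boxed = ref Unsafe.Unbox<Point>(o);
            boxed.x = 2;             // writes directly into the heap object

            return ((Point)o).x;     // the same boxed instance now holds 2
        }

        static void Main()
        {
            Console.WriteLine(ChangeBoxedX()); // prints 2
        }
    }
    ```

    No second box is allocated and no fields are copied back, which is the saving the book attributes to C++/CLI. The name Unsafe is a fair warning: the API bypasses the copy semantics that ordinary C# code relies on.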