Search code examples
c++.netc++-cliclrmanaged-code

Why does C++ need language modifications to be "managed"?


Why can't a compiler be written that manages what needs to be managed in C++ code (i.e. to make it "CLR compatible")?

Maybe with some compromise, like prohibiting void pointers in some situations etc. But all these extra keywords etc. What's the problem that has to be solved by these additions?

I have my thoughts about some aspects and what might be hard to solve, but a good solid explanation would be highly appreciated!


Solution

  • I'd have to disagree with the answers so far.

    The main problem to understand is that a C++ compiler creates code which is suitable for a very dumb environment. Even a modern CPU does not know about virtual functions, hell, even functions are a stretch. A CPU really doesn't care that exception handling code to unwind the stack is outside any function, for instance. CPU's deal in instruction sequences, with jumps and returns. Functions certainly do not have names as far as the CPU is concerned.

    Hence, everything that's needed to support the concept of a function is put there by the compiler. E.g. vtables are just arrays of the right size, with the right values from the CPUs viewpoint. __func__ ends up as a sequence of bytes in the string table, the last one of which is 00.

    Now, there's nothing that says the target environment has to be dumb. You could definitely target the JVM. Again, the compiler has to fill in what's not natively offered. No raw memory? Then allocate a big byte array and use it instead. No raw pointers? Just use integer indices into that big byte array.

    The main problem is that the C++ program looks quite unrecognizable from the hosting environment. The JVM isn't dumb, it knows about functions, but it expects them to be class members. It doesn't expect them to have < and > in their name. You can circumvent this, but what you end up with is basically name mangling. And unlike name mangling today, this kind of name mangling isn't intended for C linkers but for smart environments. So, its reflection engine may become convinced that there is a class c__plus__plus with member function __namespace_std__for_each__arguments_int_pointer_int_pointer_function_address, and that's still a nice example. I don't want to know what happens if you have a std::map of strings to reverse iterators.

    The other way around is actually a lot easier, in general. Pretty much all abstractions of other languages can be massaged away in C++. Garbage collection? That's already allowed in C++ today, so you could support that even for void*.

    One thing I didn't address yet is performance. Emulating raw memory in a big byte array? That's not going to be fast, especially if you put doubles in them. You can play a whole lot of tricks to make it faster, but at which price? You're probably not going to get a commercially viable product. In fact, you might up with a language that combines the worst parts of C++ (lots of unusual implementation-dependent behavior) with the worst parts of a VM (slow).