c++ c compiler-construction c-preprocessor

Could C++ or C99 theoretically be compiled to equally-portable C90?


This is a big question, so let me get a few things out of the way:

  1. Let's ignore the fact that some C++ features cannot be implemented in C (for example, supporting pre-main initialization for any global static object that is linked in).
  2. This is a thought experiment about what is theoretically possible. Please do not write to say how hard this would be (I know), or that I should do X instead. It's not a practical question, it's a fun theoretical one. :)

The question is: is it theoretically possible to compile C++ or C99 to C89 (that is, C90) that is as portable as the original source code?

Cfront and Comeau C/C++ already compile C++ to C. In Comeau's case, though, the generated C is not portable, according to Comeau's sales staff. I have not used the Comeau compiler myself, but I speculate that the reasons are:

  1. Macros such as INT_MAX, offsetof(), etc. have already been expanded, and their expansion is platform-specific.
  2. Conditional compilation such as #ifdef has already been resolved.
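To make the first speculated reason concrete, here is a minimal sketch contrasting a symbolic use of INT_MAX and offsetof() with the constants a translator might emit after preprocessing. The Header struct, the function names, and the frozen values (a 32-bit int, an offset of 8) are assumptions chosen for illustration; they are not taken from Comeau's actual output.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical record type, used only for this illustration.
struct Header {
    char tag;
    long length;
};

// Portable form: the macro names survive into the generated code, so each
// target's own <limits.h> and <stddef.h> supply the right values.
bool fits_in_int_portable(long v) { return v >= INT_MIN && v <= INT_MAX; }
std::size_t length_offset_portable() { return offsetof(Header, length); }

// Frozen form: what the output could look like once the macros have been
// expanded on one assumed host (32-bit int, 8-byte alignment for long).
// These constants are wrong on any target where those assumptions fail.
bool fits_in_int_frozen(long v) {
    return v >= -2147483647L - 1 && v <= 2147483647L;
}
std::size_t length_offset_frozen() { return 8; }

// Narrowing helper built on the portable check.
int to_int(long v) {
    assert(fits_in_int_portable(v));
    return static_cast<int>(v);
}
```

The two forms agree only on a machine that happens to match the assumed host, which is the sense in which already-expanded output stops being portable.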

My question is whether these problems could possibly be surmounted in a robust way. In other words, could a perfect C++ to C compiler be written (modulo the unsupportable C++ features)?

The trick is that you have to expand macros enough to do a robust parse, but then fold them back into their unexpanded forms (so they are again portable and platform-independent). But are there cases where this is fundamentally impossible?
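One way to picture the "fold back" step is for the translator to record, for every token, which macro it came from, and to print the macro name instead of the expansion when emitting C. The sketch below is only an illustration of that idea: Token, origin, and emit_folded are hypothetical names, and it assumes each macro expands to exactly one token.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One token of the translated program. All names here are hypothetical.
struct Token {
    std::string text;    // spelling after expansion on the host
    std::string origin;  // macro the token came from; empty if written literally
};

// Emit the token stream, preferring the macro name wherever provenance
// was recorded. Assumes every macro expands to exactly one token.
std::string emit_folded(const std::vector<Token>& tokens) {
    std::string out;
    for (const Token& t : tokens) {
        assert(!t.text.empty());
        if (!out.empty()) out += ' ';
        out += t.origin.empty() ? t.text : t.origin;
    }
    return out;
}
```

Recording provenance matters because the expanded value alone can be ambiguous: on a target where int and long are both 32 bits, the literal 2147483647 could have come from INT_MAX, from LONG_MAX, or from the programmer typing it directly.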

It would be very difficult for anyone to categorically say "yes, this is possible" but I'm very interested in seeing any specific counterexamples: code snippets that could not be compiled in this way for some deep reason. I'm interested in both C++ and C99 counterexamples.

I'll start out with a rough example just to give a flavor of what I think a counterexample might look like.

#ifdef __SSE__
#define OP <
#else
#define OP >
#endif

class Foo {
 public:
  bool operator <(const Foo& other) { return true; }
  bool operator >(const Foo& other) { return false; }
};

bool f() { return Foo() OP Foo(); }

This is tricky because the value of OP, and therefore the method call generated here, is platform-specific. But it seems possible for the compiler to recognize that the statement's parse tree depends on a macro's value, and to expand the macro's possible values into something like:

bool f() {
#ifdef __SSE__
   return Foo_operator_lessthan(...);
#else
   return Foo_operator_greaterthan(...);
#endif
}
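For concreteness, the lowered output could be filled in roughly as follows. This is a sketch under assumptions, not the output of any real translator: the lowered function names follow the ones used above, the placeholder member and the explicit self/other pointers are my own choices, and the code sticks to the common subset of C90 and C++ so it compiles as either.

```cpp
#include <assert.h>
#include <stddef.h>

/* C90 does not allow an empty struct, so a placeholder member is assumed. */
struct Foo {
    char placeholder;
};

/* Hypothetical lowered form of Foo::operator<, which returns true. */
int Foo_operator_lessthan(const struct Foo *self, const struct Foo *other) {
    assert(self != NULL && other != NULL);
    return 1;
}

/* Hypothetical lowered form of Foo::operator>, which returns false. */
int Foo_operator_greaterthan(const struct Foo *self, const struct Foo *other) {
    assert(self != NULL && other != NULL);
    return 0;
}

int f(void) {
    /* The two temporaries from "Foo() OP Foo()" become named locals. */
    struct Foo lhs = {0};
    struct Foo rhs = {0};
#ifdef __SSE__
    return Foo_operator_lessthan(&lhs, &rhs);
#else
    return Foo_operator_greaterthan(&lhs, &rhs);
#endif
}
```

The conditional survives into the generated C, so the choice between the two calls is still made by whichever compiler finally builds it for the target.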

Solution

  • It is not only theoretically possible but also practically trivial: use LLVM with its C backend (the cbe target).