Is it potentially problematic to #include files after user-defined tokens?


Due to the way the preprocessor works, an included header file can be affected by the file that includes it. For example, you can break C just by writing:

#define printf
#include <stdio.h>
/*  extern int printf(const char *const restrict format, ...);
    Becomes
    extern int (const char *const restrict format, ...);
    And this results in an instant syntax error */

I am concerned that any random user-defined identifier could break a header file of a particular implementation in a similar way. One could expect that library header files never break each other, because otherwise the C implementation as a whole would be broken and thus non-conforming. As such, I have been careful to always keep library headers at the top:

#include <stdio.h>
#include <somelibrary.h>

#define some_macro
typedef some_type some_other_type;

And not:

#define some_macro
typedef some_type some_other_type;

#include <stdio.h>
#include <somelibrary.h>

Because if either library happens to contain the tokens some_macro or some_other_type, it could break.

Are these concerns warranted? Or does C require that the set of tokens that can 'break' library headers be limited in some way, for instance to only the user-visible ones defined by the header?


Solution

  • I am concerned that any random user-defined identifier ...

    C has the concept of reserved identifiers (bolding mine):

    Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.

    and

    If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

    The identifiers C reserves are identified by the C standard.

    Additionally, platforms such as POSIX reserve additional identifiers, and Windows does too - some of which can be found here - but there are a lot more identifiers reserved by Windows.

    Note the second quote above, especially the bolded part:

    If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

    Yes, you can and will break your program if you use a reserved identifier.

    The fix: don't do it.

    For example, this is a reserved identifier:

    __SOMETHING

    It starts with a double underscore, so it's a reserved identifier. Don't do that.

    This is also a reserved identifier under POSIX:

    something_t

    It ends with _t, which is reserved under POSIX. Don't do that. (Lots of people break this one, and sometimes they run into problems later. Don't rely on your _t identifier being collision-safe.)

    At file scope, this is a reserved identifier:

    static int _count;

    It starts with an underscore. That's a reserved identifier at file scope. Don't do that. (Same here - lots of programmers break this rule and think they get away with it - they're just lucky. Don't make luck a necessary condition for your code to work properly...)

    Lots of programmers use reserved identifiers and don't run into collisions, so they think it's OK to do that. But they open themselves up to collisions and breakage in the future for no good reason.
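
To make the rules above concrete, here is a minimal sketch of naming that stays out of the reserved space. The `mylib_` / `MYLIB_` prefix and every name built from it are hypothetical, chosen only for illustration. The sketch also defines two ordinary macros (`format` and `stream`) before the includes. As I read the quoted text, those names are not reserved, so the headers of a conforming implementation should not be affected by them (real implementations typically spell their internal names like `__format`). Platforms such as POSIX reserve additional names, so treat this as an illustration rather than a guarantee for every system.

```c
/* Ordinary, non-reserved macro names defined BEFORE the includes.
   Assumption: the implementation is conforming, so its headers only
   rely on reserved identifiers internally. */
#define format 42
#define stream 7

#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical project prefix: MYLIB_ for macros, mylib_ for everything else. */
#define MYLIB_MAX_NAME 16

/* Not "point_t": the _t suffix is reserved under POSIX. */
typedef struct mylib_point {
    int x;
    int y;
} mylib_point;

/* Not "_count": a leading underscore is reserved at file scope. */
static int mylib_count;

/* Returns a fresh id on each call, starting at 1. */
int mylib_next_id(void)
{
    return ++mylib_count;
}

/* Returns x + y for a non-null point. */
int mylib_point_sum(const mylib_point *p)
{
    assert(p != NULL);
    return p->x + p->y;
}

/* The user macros still expand normally after the includes. */
int mylib_format_plus_stream(void)
{
    return format + stream;
}

/* Remove the demonstration macros so code further down is unaffected. */
#undef format
#undef stream
```

Calling `mylib_next_id()` twice yields 1 and then 2, and `mylib_format_plus_stream()` yields 49. The prefix does not make collisions impossible, but it keeps the names out of the categories the answer lists as reserved (double underscore, leading underscore at file scope, `_t` suffix under POSIX).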