Why define "extern_" rather than using "extern"?


I was trying to write a simple lexer and found this GitHub repository (https://github.com/DoctorWkt/acwj/tree/master/01_Scanner). In its source code I saw this:

data.h:

...
#ifndef extern_
 #define extern_ extern
#endif

extern_ int Line;
extern_ int Putback;
extern_ FILE *Infile;
...

main.c:

...
#define extern_
#include "data.h"
#undef extern_
...

If I use just the extern keyword, it doesn't work, but it works with extern_. What is the difference?


Solution

  • The direct function of the extern_ macro is to control whether the extern keyword appears in variable declarations. In principle, it could also be used to substitute some other keyword or to add qualifiers, but that appears to be incidental.

    At this point, it is important to remark that the code presented in the question is not representative of the code in the referenced GitHub project. On GitHub, the variable declarations appear in the header, and not in main.c. That's directly relevant to why such a facility is useful.

    In particular then, consider the difference between the two alternatives exercised in the overall project:

    • Most C source files in the project #include the header without defining macro extern_. In those cases, the header itself defines the macro extern_ to expand to the keyword extern, resulting in these declarations in those translation units:

       extern int Line;
       extern int Putback;
       extern FILE *Infile;
      
    • File main.c is special. It defines macro extern_ to expand to nothing, and it includes the header within the scope of that definition. The header then relies on the provided macro definition, so that in that translation unit alone, the resulting declarations are

        int Line;
        int Putback;
        FILE *Infile;
      

    The difference is that the former are pure declarations, whereas the latter are tentative definitions. This matters because each object with external linkage that the program uses must be defined in exactly one translation unit. A tentative definition that is not superseded by an explicit definition in the same translation unit is treated, at the end of that translation unit, as a definition with a zero initializer, so main.c is the one place where these variables actually get defined.
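To make the two roles concrete, here is a minimal single-file sketch that plays both of them in one translation unit: first the default view (extern_ expands to extern), then the main.c view (extern_ expands to nothing). The real project spreads these across separate files; the helper functions `globals_are_zero_initialized` and `check_globals` are hypothetical names added only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Role 1: the view most source files get. extern_ is not yet defined,
   so the header's fallback makes it expand to the keyword extern. */
#ifndef extern_
 #define extern_ extern
#endif
extern_ int Line;       /* becomes: extern int Line;      declaration only */
extern_ FILE *Infile;   /* becomes: extern FILE *Infile;  declaration only */
#undef extern_

/* Role 2: the view main.c gets. extern_ expands to nothing. */
#define extern_
extern_ int Line;       /* becomes: int Line;      tentative definition */
extern_ FILE *Infile;   /* becomes: FILE *Infile;  tentative definition */
#undef extern_

/* Hypothetical helper, not part of the project: reports whether the
   tentatively defined objects ended up zero-initialized. */
int globals_are_zero_initialized(void)
{
    return Line == 0 && Infile == NULL;
}

/* Hypothetical self-check built on the helper above. */
void check_globals(void)
{
    assert(globals_are_zero_initialized());
}
```

Repeating a declaration and then tentatively defining the same object in one translation unit is permitted, which is why this compiles; it is also exactly what main.c ends up with if it includes another header that declares the same variables.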

    Overall, then, the effect is that the same header can be used in two different roles: on one hand, by default, to declare the identifiers of external objects so that they can be accessed from other translation units, and on the other hand to define them in one selected translation unit, so that they actually exist in the program.

    If I use just the extern keyword, it doesn't work, but it works with extern_. What is the difference?

    Declaring a variable extern without an initializer is a promise that the variable is defined somewhere in the program; it does not itself define the variable. If every declaration of a variable in the whole program takes that form, nothing ever defines it, the promises go unfulfilled, and using the variable results in undefined behavior. In practice that usually shows up as a link failure.
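As a rough illustration of that promise, the sketch below declares Putback with extern, uses it, and then supplies the definition that keeps the promise. It is collapsed into one file for brevity; the initial value 7 and the helpers `read_putback` and `check_promise_kept` are assumptions made up for this example, not taken from the project.

```c
#include <assert.h>

/* Declaration only: a promise that Putback is defined somewhere. */
extern int Putback;

/* Hypothetical helper that uses the variable, so a definition is required. */
int read_putback(void)
{
    return Putback;
}

/* The definition that keeps the promise. In a multi-file program this line
   would live in exactly one .c file. Remove it, and a typical toolchain
   stops at link time complaining that Putback is undefined. */
int Putback = 7;

/* Hypothetical self-check: the declaration and the definition refer to
   the same object. */
void check_promise_kept(void)
{
    assert(read_putback() == 7);
}
```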

    As long as I mention initializers, it seems prudent to note that writing initializers into the declarations in the header is not a viable solution, for then every translation unit that included the header would have definitions of all the variables, whereas there must not be more than one definition of each in the entire program. The behavior would again be undefined. There is a better chance in practice that the program would be accepted, but it would still be wrong.

    Finally, I note that this whole business with the extern_ macro is somewhat of a hack, and should not be regarded as conventional. The canonical way to do this is for the header simply to declare all the variables extern, and for there to be a separate definition of each in a chosen C source file (not necessarily all in the same file). Example:


    data.h

    #include <stdio.h>  /* for FILE */

    extern int Line;
    extern int Putback;
    extern FILE *Infile;
    

    main.c

    #include "data.h"
    
    int Line /* optionally with an initializer here */;
    int Putback /* optionally with an initializer here */;
    FILE *Infile /* optionally with an initializer here */;
    
    // ...
    

    other.c

    #include "data.h"
    
    // no (additional) declarations or definitions of the variables declared in data.h