
'cpp' vs. 'clang' Preprocessing Behavior


a.txt

a text
#include "b.txt"

b.txt

b text

If we pre-process the above files using cpp -P a.txt, we get the following output in the console:

a text
b text

However, if we attempt to pre-process using clang -P a.txt, we receive the following error:

ld: unknown file type in '/Users/myUser/a.txt'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

What is the difference between the pre-processing behavior of clang and cpp? In particular, how do they differ in this use case of .txt files? There is a seemingly related thread, as well as another specific to macOS.


Solution

  • cpp is a preprocessor. clang is a compiler that may invoke a preprocessor before compiling its input files. You can't just use them with the same command-line options.

    clang is similar to gcc. Both take a -E option to invoke just the preprocessing phase -- and both assume that an input file whose name ends with .txt is intended for the linker.

    The following works on my system. The - argument tells the command to read from stdin, so it doesn't see the .txt suffix.

    (Or you can use -xc or -x c to tell the compiler to assume that the input file is C source, regardless of the file name. Thanks to KamilCuk for reminding me of this.)

    $ cat a.txt
    a text
    #include "b.txt"
    $ cat b.txt 
    b text
    $ gcc -P -E - < a.txt 
    a text
    b text
    $ clang -P -E - < a.txt
    a text
    b text
    
    $  
    

    The preprocessor used by clang emits an extra blank line at the end of its output. Since the output is normally meant to be compiled as C or C++, this rarely matters.

    The C preprocessor isn't really a general-purpose text processor. For example, it splits its input into preprocessing tokens, so a lone apostrophe is likely to cause an error as the preprocessor treats it as an incomplete character constant.