Tags: c, compiler-construction, printf, lexical-analysis

How does the C compiler parse the following C statement?


Consider the following lines:

int i;
printf("%d",i);

Will the lexical analyzer go into the string to parse % and d as separate tokens, or will it parse "%d" as one token?


Solution

  • There are two parsers at work here. The first is the C compiler, which parses the C file. Its lexer treats "%d" as a single string-literal token and otherwise ignores the content of the string (though modern C compilers will also inspect the string to help catch bad format strings, that is, mismatches between a % conversion specifier and the corresponding argument passed to printf() to be converted).

    The second parser is the format-string parser built into the C runtime library. It is called at run time, whenever you call printf, to scan the format string for conversion specifiers. This parser is of course very simple in comparison.

    I have not checked, but I would guess that C compilers that help check for bad format strings implement a printf-like parser as a post-processing step (i.e. with its own lexer, separate from the one used for the C source).