Tags: c++, c++11, printf, c99, code-translation

C99 printf formatters vs C++11 user-defined-literals


This code:

#define __STDC_FORMAT_MACROS
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(int argc,char **argv)
{
   int64_t val=1234567890;   /* signed, to match the PRId64 conversion */
   printf("%"PRId64"\n",val);
   exit(0);
}

compiles as C99, C++03, and C++11 with GCC 4.5, but fails to compile as C++11 with GCC 4.7.1. Adding a space before PRId64 lets GCC 4.7.1 compile it.

Which compiler is correct?


Solution

  • gcc 4.7.1 is correct. According to the C++11 standard:

    2.2 Phases of translation [lex.phases]

    1 - The precedence among the syntax rules of translation is specified by the following phases. [...]
    3. The source file is decomposed into preprocessing tokens (2.5) and sequences of white-space characters (including comments). [...]
    4. Preprocessing directives are executed, macro invocations are expanded, [...]

    And per 2.5 Preprocessing tokens [lex.pptoken], user-defined-string-literal is a preprocessing token production:

    2.14.8 User-defined literals [lex.ext]

    user-defined-string-literal:
        string-literal ud-suffix
    ud-suffix:
        identifier

    So the phase-4 macro expansion of PRId64 is irrelevant, because "%"PRId64 has already been parsed as a single user-defined-string-literal preprocessing token consisting of string-literal "%" and ud-suffix PRId64.
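
To see the tokenization rule in isolation, here is a minimal sketch using a legitimate user-defined literal. The `_len` suffix and its operator are hypothetical, introduced only for this illustration. The point is that an identifier written directly against the closing quote becomes part of the literal token, whereas whitespace keeps the string literal ordinary.

```cpp
#include <cstddef>

// Hypothetical literal operator, defined only for this illustration.
// It returns the length of the string literal it is attached to.
constexpr std::size_t operator""_len(const char*, std::size_t n) {
    return n;
}

// "%"_len is lexed in phase 3 as ONE user-defined-string-literal token:
// string-literal "%" followed by ud-suffix _len.
constexpr std::size_t glued = "%"_len;

// Without a suffix, "%" is an ordinary string literal of type
// const char[2] ('%' plus the terminating null character).
constexpr std::size_t plain = sizeof("%");

static_assert(glued == 1, "the suffix was applied to the one-character literal");
static_assert(plain == 2, "an ordinary literal includes the null terminator");
```

In `"%"PRId64` the lexer applies the same rule, so `PRId64` is consumed as a ud-suffix before macro expansion ever sees it.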

    Oh, this is going to be awesome; everyone will have to change

    printf("%"PRId64"\n", val);
    

    to

    printf("%" PRId64"\n", val);     // note extra space
    

    However, gcc and clang have agreed to treat a user-defined string literal whose suffix lacks a leading underscore as two separate tokens, on the grounds that such a literal would not be well-formed anyway; see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52538. Future versions of gcc (the 4.8 branch, I think) should therefore compile existing code again.
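
For code that has to build with any of these compilers, putting whitespace on both sides of the macro sidesteps the question entirely. The sketch below is one way to write it. The helper name `format_i64` is hypothetical and not part of the original post, and it uses `int64_t` so that the argument type matches `PRId64`.

```cpp
#define __STDC_FORMAT_MACROS  // some older C++ library headers require this before <inttypes.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string>

// Hypothetical helper (not from the original post): formats a signed
// 64-bit value as decimal text followed by a newline.
std::string format_i64(int64_t val) {
    // Room for the longest int64_t text (20 characters), '\n', and '\0'.
    char buf[32];
    // The spaces keep PRId64 a separate preprocessing token, so it is
    // macro-expanded in phase 4 and then concatenated with its neighbours.
    snprintf(buf, sizeof buf, "%" PRId64 "\n", val);
    return std::string(buf);
}
```

Calling `format_i64(1234567890)` should yield the same text the original `printf` call prints, whichever tokenization rule the compiler implements.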