c macros c-preprocessor preprocessor pragma

Why are some preprocessor macros not expanded unless they are arguments to another macro?

In certain situations, some token sequences are not preprocessed fully. For example:

#define EMPTY()
#define DELAY(x) x EMPTY()
#define PRAGMA(args) _Pragma(args)

#define WRAP( BODY ) { BODY }

#define LOOP_Good( body, i, LB, UB ) \
WRAP( \
    DELAY(PRAGMA)("omp parallel for") \
    for( i = LB; i < UB; ++i ){ \
        body \
    } \
)

#define LOOP_Bad( body, i, LB, UB ) \
{ \
    DELAY(PRAGMA)("omp parallel for") \
    for( i = LB; i < UB; ++i ){ \
        body \
    } \
}

#define LOOP_Good_Again( body, i, LB, UB ) \
{ \
    PRAGMA("omp parallel for") \
    for( i = LB; i < UB; ++i ){ \
        body \
    } \
}

// Good
int i;
int lower_i = 0;
int upper_i = 10;

LOOP_Good( printf("%d\n", i);, i, lower_i, upper_i )

// Bad
LOOP_Bad( printf("%d\n", i);, i, lower_i, upper_i )

// Good again
LOOP_Good_Again( printf("%d\n", i);, i, lower_i, upper_i )

With gcc 9.1 and the options -E -fopenmp, this expands to the following (reformatted for readability):

int i;
int lower_i = 0;
int upper_i = 10;

// Good
{ 
  #pragma omp parallel for
  for( i = lower_i; i < upper_i; ++i ){ 
    printf("%d\n", i); 
  } 
}

// Bad
{ 
  PRAGMA ("omp parallel for") 
  for( i = lower_i; i < upper_i; ++i ){ 
    printf("%d\n", i); 
  } 
}

// Good again
{ 
  #pragma omp parallel for
  for( i = lower_i; i < upper_i; ++i ){ 
    printf("%d\n", i); 
  } 
}

In the 'good' case, DELAY(PRAGMA) is expanded to PRAGMA, which is then expanded (with the adjacent arguments) to _Pragma(...).

In the 'bad' case, DELAY(PRAGMA) is expanded to PRAGMA, but processing stops there and PRAGMA is left in the output. If you take the 'bad' output and preprocess it again (with all the previously defined macros), it expands correctly.

The only difference is that in the 'good' case, DELAY(PRAGMA) is part of the argument to the WRAP macro, whereas the 'bad' case does not pass DELAY(PRAGMA) into any macro. If we instead use PRAGMA alone in the 'bad' case, the problem is solved (as in the 'good again' case).

What's the reason for the different behaviors in the 'good' and 'bad' cases?

Solution

In the bad case, the tokens you intend as arguments to PRAGMA never appear together with PRAGMA in a token sequence that is scanned for macro replacement.

We can ignore the LOOP_xxx macros; they just expand to various tokens without complications, and the resulting tokens are processed as if they appeared in the source file normally. We can instead consider just DELAY(PRAGMA)(foo) and WRAP(DELAY(PRAGMA)(foo)).

Per C 2018 6.10.3.1 and 6.10.3.4, the arguments of a macro are processed for macro replacement, then the resulting tokens are substituted into the macro’s replacement tokens, then the resulting tokens and subsequent tokens of the source file are rescanned for further replacement. (When the tokens of a macro argument are being processed, they are treated as if they constitute the entire source file.)
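The pre-expansion of arguments described above can be observed with the usual two-level stringification idiom. This is only a sketch: VALUE, QUOTE, XQUOTE and the two helper functions are hypothetical names that do not appear in the question. An argument that is an operand of # is not macro-replaced before substitution, while an ordinary argument is.

```c
#define VALUE 42             /* hypothetical object-like macro */
#define QUOTE(x) #x          /* x is an operand of #, so it is not expanded first */
#define XQUOTE(x) QUOTE(x)   /* x is fully expanded, then QUOTE is found on rescan */

/* The argument VALUE is stringified as written. */
static const char *quoted_raw(void) { return QUOTE(VALUE); }

/* The argument VALUE is replaced by 42 before it reaches QUOTE. */
static const char *quoted_expanded(void) { return XQUOTE(VALUE); }
```

Here quoted_raw() yields "VALUE" and quoted_expanded() yields "42", because only in the second case is the argument macro-replaced before being substituted.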

In DELAY(PRAGMA)(foo):

1. PRAGMA is the argument x to DELAY, but it is not followed by parentheses, so it is not a macro to replace.
2. PRAGMA is substituted for x in DELAY’s replacement tokens x EMPTY().
3. The result, PRAGMA EMPTY(), is scanned for replacement.
4. EMPTY is replaced by nothing.
5. The results of the replacement of EMPTY, along with the subsequent tokens ((foo), and anything that follows it) are scanned. Note that PRAGMA is not in these tokens: It is not part of the tokens that resulted from replacing EMPTY.
6. Macro replacement is complete.
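The steps above can be checked at run time without relying on -E output. The following is a minimal sketch that assumes a hypothetical name PROBE, used both for a real function and for a function-like macro (the function is defined first, so its definition is not itself macro-replaced). If the macro PROBE is never invoked, the call reaches the function instead.

```c
/* Real function named PROBE; it must be defined before the macro below. */
static int PROBE(int x) { return -x; }

#define EMPTY()
#define DELAY(x) x EMPTY()
#define PROBE(x) ((x) + 100)   /* hypothetical stand-in for PRAGMA */

/* DELAY(PROBE)(1) becomes PROBE (1), but PROBE has already been passed
   over by the time (1) is seen, so the macro PROBE is not invoked and
   the function PROBE is called instead. */
static int unwrapped(void) { return DELAY(PROBE)(1); }
```

With this setup, unwrapped() returns -1 (the function's result) rather than 101 (the macro's result), which mirrors PRAGMA being left unexpanded in the 'bad' case.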

In WRAP(DELAY(PRAGMA)(foo)), the first five steps are the same, except that they take place while the argument to WRAP is being processed. The remaining steps result in replacement of PRAGMA (foo):

1. PRAGMA is the argument x to DELAY, but it is not followed by parentheses, so it is not a macro to replace.
2. PRAGMA is substituted for x in DELAY’s replacement tokens x EMPTY().
3. The result, PRAGMA EMPTY(), is scanned for replacement.
4. EMPTY is replaced by nothing.
5. The results of the replacement of EMPTY, along with the subsequent tokens ((foo)) are scanned. As above, PRAGMA is not in these tokens, so it is not replaced.
6. Macro replacement of the argument to WRAP is complete, having produced PRAGMA (foo).
7. These tokens from the argument are substituted into WRAP’s { BODY }, producing { PRAGMA (foo) }.
8. These tokens (along with following tokens in the source file) are rescanned for further replacement. Now PRAGMA (foo) appears in these tokens, so it is replaced.
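The extra rescan that an enclosing macro provides can be sketched in the same run-time style, compiled as its own file. PROBE (a hypothetical name used for both a real function and a function-like macro) and PASS (a hypothetical identity macro playing the role of WRAP) are assumptions for illustration only and are not part of the question's code.

```c
/* Real function named PROBE; it must be defined before the macro below. */
static int PROBE(int x) { return -x; }

#define EMPTY()
#define DELAY(x) x EMPTY()
#define PROBE(x) ((x) + 100)   /* hypothetical stand-in for PRAGMA */
#define PASS(x) x              /* hypothetical stand-in for WRAP */

/* The argument DELAY(PROBE)(1) is first expanded on its own to PROBE (1).
   Those tokens are substituted into PASS's replacement list and rescanned,
   and this time PROBE is directly followed by (1), so the macro is invoked. */
static int wrapped(void) { return PASS(DELAY(PROBE)(1)); }
```

Here wrapped() returns 101, the macro's result, because the rescan of PASS's replacement tokens sees PROBE and (1) together. This suggests that passing the body of LOOP_Bad through any enclosing macro, as WRAP does in LOOP_Good, should be enough to get the pragma expanded.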