I'll use C to illustrate the topic, but the same macro can also be used in C++ (with or without struct), raising the same question.
I came up with this macro:
#define STR_MEMBER(S,X) (((struct S*)NULL)->X, #X)
Its purpose is to produce the name of an existing struct member as a string (const char*), so that compilation fails if the member doesn't exist. A minimal usage example:
#include <stdio.h>
struct a
{
    int value;
};
int main(void)
{
    printf("a.%s member really exists\n", STR_MEMBER(a, value));
    return 0;
}
If value weren't a member of struct a, the code wouldn't compile, and this is what I wanted.
The comma operator evaluates its left operand and then discards the result (if there is one), so my understanding is that this operator is normally used when evaluating the left operand has side effects.
In this case, however, there are no (intended) side effects. Of course, the macro works only if the compiler doesn't actually emit code that evaluates the expression; otherwise it would access a struct located at NULL and a segmentation fault would occur.
GCC/g++ 6.3 and 4.9.2 never produced that dangerous code, even with -O0, as if they were always able to "see" that the evaluation has no side effects and can therefore be skipped.
Adding volatile to the macro (e.g. because accessing that memory address is the desired side effect) has so far been the only way to trigger the segmentation fault.
So the question: is there anything in the C and C++ standards that guarantees that compilers will always skip the actual evaluation of the left operand of the comma operator when they can be sure that the evaluation has no side effects?
I am not asking for a judgment on the macro as it is, or on whether it should be used or improved. For the purpose of this question, the macro is bad if and only if it invokes undefined behaviour, i.e. if and only if it is risky because compilers are allowed to generate the "evaluation code" even when the evaluation has no side effects.
I already have two obvious fixes in mind: "reifying" the struct and using offsetof. The former needs an accessible memory area as big as the biggest struct we use as the first argument of STR_MEMBER (e.g. maybe a static union could do…). The latter should work flawlessly: it yields an offset we aren't interested in and avoids the access problem. Here I'm assuming gcc, because it's the compiler I use (hence the tag), and that its offsetof built-in behaves.
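The "reifying" idea could be sketched roughly as follows. The union name str_member_storage, the second struct b, and the convention that each union member is named after its struct tag are assumptions made for this illustration; they are not part of the original macro.

```c
#include <assert.h>
#include <string.h>

struct a { int value; };
struct b { double weight; char tag; };

/* Assumption: one static union holding every struct we want to check,
 * each member named after its struct tag. The union is as large as the
 * biggest struct and is zero-initialized (static storage). */
static union {
    struct a a;
    struct b b;
} str_member_storage;

/* The left operand now refers to a real object, so no null pointer is
 * involved. The (void) cast only silences "no effect" warnings. */
#define STR_MEMBER(S, X) ((void)str_member_storage.S.X, #X)

/* Hypothetical self-check for this sketch. */
int reified_demo(void)
{
    assert(strcmp(STR_MEMBER(a, value), "value") == 0);
    assert(strcmp(STR_MEMBER(b, weight), "weight") == 0);
    return 1;
}
```

The cost is the one already mentioned: every struct passed to the macro has to be listed in the union.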
With the offsetof fix (which needs <stddef.h>), the macro becomes:
#define STR_MEMBER(S,X) (offsetof(struct S,X), #X)
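A minimal, self-contained sketch of the offsetof variant in use. The (void) cast and the helper offsetof_demo are additions for illustration only: the cast just avoids a "left-hand operand has no effect" warning on some compilers.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */
#include <string.h>

struct a {
    int value;
};

/* offsetof yields an integer constant; no object is dereferenced. */
#define STR_MEMBER(S, X) ((void)offsetof(struct S, X), #X)

/* Hypothetical helper: checks the string produced by the macro. */
int offsetof_demo(void)
{
    const char *name = STR_MEMBER(a, value);
    assert(strcmp(name, "value") == 0);
    /* STR_MEMBER(a, missing) would be rejected at compile time,
     * because struct a has no member called "missing". */
    return 1;
}
```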
Here, writing volatile struct S instead of struct S doesn't cause the segfault.
Suggestions about other possible “fixes” are welcome, too.
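One further option, offered as a sketch rather than a definitive answer: wrap the member access in sizeof. The operand of sizeof is not evaluated (unless it has a variable-length array type), so the null pointer is never dereferenced, while a missing member still breaks the build. The name STR_MEMBER_SIZEOF and the extra label member are made up for this example. One known limitation is that sizeof cannot be applied to a bit-field member.

```c
#include <assert.h>
#include <string.h>

struct a {
    int value;
    char label[8];
};

/* The member access sits inside sizeof, which does not evaluate its
 * operand; only the type of ((struct S *)0)->X is inspected. */
#define STR_MEMBER_SIZEOF(S, X) ((void)sizeof(((struct S *)0)->X), #X)

/* Hypothetical self-check for this sketch. */
int sizeof_demo(void)
{
    assert(strcmp(STR_MEMBER_SIZEOF(a, value), "value") == 0);
    assert(strcmp(STR_MEMBER_SIZEOF(a, label), "label") == 0);
    return 1;
}
```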
Actually, the real use case was in C++, in a struct with static storage. This seems to be fine in C++, but as soon as I tried C with code closer to the original than the version boiled down for this question, I realized that C isn't happy with it at all:
error: initializer element is not constant
C requires the struct's initializer to be a compile-time constant, whereas C++ is fine with it.
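The C error is most likely due to the comma operator itself, which C does not allow in a constant expression (and initializers of static-storage objects must be constant expressions). A possible workaround, sketched here on the assumption that the compiler accepts a string literal plus an integer constant expression as an address constant, replaces the comma with an addition of zero. The names STR_MEMBER_CONST and member_table are hypothetical.

```c
#include <assert.h>
#include <string.h>

struct a {
    int value;
    int other;
};

/* No comma operator: the member check lives inside an unevaluated
 * sizeof, which is multiplied by 0 and added to the literal's address. */
#define STR_MEMBER_CONST(S, X) (#X + 0 * sizeof(((struct S *)0)->X))

/* Static-storage table, initialized from constant expressions only. */
static const char *const member_table[] = {
    STR_MEMBER_CONST(a, value),
    STR_MEMBER_CONST(a, other),
};

/* Hypothetical self-check for this sketch. */
int member_table_demo(void)
{
    assert(strcmp(member_table[0], "value") == 0);
    assert(strcmp(member_table[1], "other") == 0);
    return 1;
}
```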
The comma operator has no such guarantee (the C documentation says something very similar).
In a comma expression E1, E2, the expression E1 is evaluated, its result is discarded ..., and its side effects are completed before evaluation of the expression E2 begins
(irrelevant information omitted)
To put it simply, E1 will be evaluated, although the compiler might optimize it away under the as-if rule if it is able to determine that there are no side effects.
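A small sketch of what "E1 will be evaluated" means in practice: when the left operand has an observable side effect, that effect must happen, whereas an operand of sizeof is not evaluated at all. The function bump and the counter calls are invented for this illustration.

```c
#include <assert.h>

static int calls = 0;

/* Has an observable side effect: increments the counter. */
static int bump(void)
{
    return ++calls;
}

int comma_demo(void)
{
    int r = (bump(), 42);      /* E1 is evaluated, its result discarded */
    assert(r == 42);
    assert(calls == 1);

    (void)sizeof(bump());      /* unevaluated operand: bump() is not called */
    assert(calls == 1);

    return calls;
}
```

With a left operand that has no side effects, the as-if rule lets the compiler drop the evaluation, but nothing obliges it to.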