c, language-lawyer, c11, gcc4.9

Is memcpy(&a + 1, &b + 1, 0) defined in C11?


This question follows a previous question about whether memcpy(0, 0, 0) is defined, which has been conclusively determined to be undefined behavior.

As the linked question shows, the answer hinges on the contents of C11's clause 7.1.4:1

Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, […]) […] the behavior is undefined. […]

The standard function memcpy() expects pointers to void and const void, like so:

void *memcpy(void * restrict s1, const void * restrict s2, size_t n);

The question is worth asking at all only because there are two notions of “valid” pointers in the standard. There are the pointers that can validly be obtained through pointer arithmetic and can validly be compared with <, > to other pointers inside the same object. And there are the pointers that are valid for dereferencing. The former class includes “one-past” pointers such as &a + 1 and &b + 1 in the following snippet, whereas the latter class does not.

char a;
const char b = '7';
memcpy(&a + 1, &b + 1, 0);

Should the above snippet be considered defined behavior, given that the arguments of memcpy() are typed as pointers to void anyway, so that the question of their validity cannot be about dereferencing them? Or should &a + 1 and &b + 1 be considered “outside the address space of the program”?

This matters to me because I am in the process of formalizing the effects of standard C functions. I had written one pre-condition of memcpy() as requires \valid(s1+(0 .. n-1)); until it was brought to my attention that GCC 4.9 had started to aggressively optimize such library function calls beyond what is expressed in the formula above (indeed). In this particular specification language, the formula \valid(s1+(0 .. n-1)) is equivalent to true when n is 0, so it does not capture the undefined behavior that GCC 4.9 relies on to optimize.
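To make the gap concrete, here is a minimal sketch in C of the two candidate pre-conditions. It is only a model: a pointer is represented by the size of the object it points into and its byte offset within that object, and the names range_ok_lenient and range_ok_strict are hypothetical, not part of any specification language or library. The strict variant encodes one possible reading (that the pointer must designate a byte of an object even when n is 0), which is an assumption and not something the standard spells out in these terms.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a pointer is described by the size in bytes of the
   object it points into and its byte offset from the start of that object.
   A one-past pointer has off == size. */

/* Lenient reading, mirroring \valid(s1+(0 .. n-1)): every byte of the
   range must lie inside the object, which is vacuously true when n == 0. */
static bool range_ok_lenient(size_t size, size_t off, size_t n)
{
    if (n == 0)
        return true;
    return off < size && n <= size - off;
}

/* Stricter reading (an assumption): the pointer must point to a byte of
   the object even when n == 0, so a one-past pointer is rejected. */
static bool range_ok_strict(size_t size, size_t off, size_t n)
{
    return off < size && n <= size - off;
}
```

For the snippet in the question, each object has size 1 and each argument has offset 1, so range_ok_lenient(1, 1, 0) holds while range_ok_strict(1, 1, 0) does not. The two predicates agree whenever n is nonzero.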


Solution

  • C11 says:

    (C11, 7.24.2.1p2) "The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1."

    The expression &a + 1 is itself a valid pointer-plus-integer addition, but its result is not a pointer to an object, so the call invokes undefined behavior.
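
If one accepts this reading, a defensive option in calling code is to never pass such pointers to memcpy() in the first place. The sketch below uses a hypothetical wrapper named copy_bytes (not a standard function) that skips the library call when n is 0. Forming &a + 1 and passing it to an ordinary function is fine; the wrapper only ensures that it never reaches memcpy().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not part of the standard library: skips the call
   to memcpy() when there is nothing to copy, so that one-past (or null)
   pointers are never handed to the library function. As with memcpy(),
   the caller must still ensure the two ranges do not overlap. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n != 0)
        memcpy(dst, src, n);
    return dst;
}
```

With char a; and const char b = '7'; as in the question, copy_bytes(&a + 1, &b + 1, 0) returns its first argument without calling memcpy(), while copy_bytes(&a, &b, 1) copies the single byte as usual.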