When doing pointer arithmetic with offsetof, is it well-defined behavior to take the address of a struct, add the offset of a member to it, and then dereference that address to get to the underlying member?
Consider the following example:
#include <stddef.h>
#include <stdio.h>

typedef struct {
    const char* a;
    const char* b;
} A;

int main() {
    A test[3] = {
        {.a = "Hello", .b = "there."},
        {.a = "How are", .b = "you?"},
        {.a = "I\'m", .b = "fine."}};
    for (size_t i = 0; i < 3; ++i) {
        char* ptr = (char*) &test[i];
        ptr += offsetof(A, b);
        printf("%s\n", *(char**)ptr);
    }
}
This should print "there.", "you?" and "fine." on three consecutive lines, which it currently does with both clang and gcc, as you can verify yourself on wandbox. However, I am unsure whether any of these pointer casts and arithmetic violate some rule which would cause the behavior to become undefined.
As far as I can tell, it is well-defined behavior. But only because you access the data through a char type. If you had used some other pointer type to access the struct, it would have been a "strict aliasing violation".
Strictly speaking, it is not well-defined to access an array out-of-bounds, but it is well-defined to use a character type pointer to grab any byte out of a struct. By using offsetof you guarantee that this byte is not a padding byte (which could have meant that you would get an indeterminate value).
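To illustrate the point about character-type access, here is a minimal sketch that reads a member one byte at a time through an unsigned char pointer, starting at the position given by offsetof. The struct Sample and the function read_value_bytewise are hypothetical names made up for this illustration; they are not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct for illustration: most ABIs insert padding
   between `tag` and `value`. */
typedef struct {
    char tag;
    unsigned value;
} Sample;

/* Reads the bytes of `value` through an unsigned char pointer.
   offsetof(Sample, value) skips `tag` and any padding after it,
   so every byte read here belongs to the member itself. */
unsigned read_value_bytewise(const Sample* s) {
    assert(s != NULL);
    const unsigned char* bytes = (const unsigned char*)s;
    unsigned char buf[sizeof(unsigned)];
    for (size_t i = 0; i < sizeof buf; ++i) {
        buf[i] = bytes[offsetof(Sample, value) + i];
    }
    /* Reassemble the object representation into an unsigned. */
    unsigned out;
    memcpy(&out, buf, sizeof out);
    return out;
}
```

Every access to the struct's storage goes through a character type, and the final value is rebuilt with memcpy rather than by dereferencing a cast pointer.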
Note, however, that casting away the const qualifier does result in poorly-defined behavior: the members are declared as const char*, but the code reads them as plain char*.
EDIT
Similarly, the cast (char**)ptr is an invalid pointer conversion - this alone is undefined behavior as it violates strict aliasing. The variable ptr itself was declared as a char*, so you can't lie to the compiler and say "hey, this is actually a char**", because it is not. This is regardless of what ptr points at.
I believe that the correct code with no poorly-defined behavior would be this:
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char* a;
    const char* b;
} A;

int main() {
    A test[3] = {
        {.a = "Hello", .b = "there."},
        {.a = "How are", .b = "you?"},
        {.a = "I\'m", .b = "fine."}};
    for (size_t i = 0; i < 3; ++i) {
        const char* ptr = (const char*) &test[i];
        ptr += offsetof(A, b);
        /* Extract the const char* from the address that ptr points at,
           and store it inside ptr itself: */
        memmove(&ptr, ptr, sizeof(const char*));
        printf("%s\n", ptr);
    }
}
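The same idea can be wrapped in a small helper that copies the stored pointer into a separate variable instead of reusing ptr. This is only a sketch under the assumption that the member at the given offset really is a const char*; read_string_member is a hypothetical name, not a standard function, and memcpy is used here because the source and destination are distinct objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char* a;
    const char* b;
} A;

/* Hypothetical helper: returns the const char* member stored `offset`
   bytes into the object at `base`. The caller must pass an offset that
   names a const char* member, e.g. offsetof(A, b). */
const char* read_string_member(const void* base, size_t offset) {
    assert(base != NULL);
    const char* result;
    /* Byte-wise copy of the pointer's object representation; the struct
       storage is only ever addressed through a character type. */
    memcpy(&result, (const unsigned char*)base + offset, sizeof result);
    return result;
}
```

With this in place, the loop body could be reduced to a single call such as printf("%s\n", read_string_member(&test[i], offsetof(A, b))), which keeps the byte pointer and the extracted string pointer in separate variables.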