Tags: c, type-conversion, undefined-behavior, type-punning

Are there issues with casting a struct containing a sized pointer to its const-equivalent?


I'm concerned about portability issues, data alignment problems, or undefined behavior when casting between two structures that have identical members, except that one of them declares Data as a pointer to const.

In the code below, I'm referring to this cast: (SizedConstPtr*) &SRC

/* writable Data type */
typedef struct
{
  unsigned long Size;
  unsigned char * Data;
} SizedPtr;

/* read-only Data type */
typedef struct
{
        unsigned long   Size;
  const unsigned char * Data;
} SizedConstPtr;

void data_copy ( SizedPtr dst, SizedConstPtr src );

extern SizedPtr SRC;
extern SizedPtr DST;

void do_copy ( void )
{
  data_copy ( DST, *(SizedConstPtr*) &SRC );
}

If I compile the file with these GCC flags:

-Wall -Werror -Wpedantic -Wextra

I don't get any warnings, but I'm still not sure the code is correct, and I don't know how to verify this myself against the C standard.


Solution

  • Accessing SRC via SizedConstPtr is not defined by the C standard

    The behavior of *(SizedConstPtr*) &SRC is not defined by the C standard because it does not conform to the aliasing rules in C 2018 6.5 7 (original bullets changed to numbers for reference):

    An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

    1. a type compatible with the effective type of the object,

    2. a qualified version of a type compatible with the effective type of the object,

    3. a type that is the signed or unsigned type corresponding to the effective type of the object,

    4. a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

    5. an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

    6. a character type.

    The type used to access SRC in *(SizedConstPtr*) &SRC is SizedConstPtr, and the effective type of SRC is its declared type, SizedPtr. We check this pair of types against each case listed in 6.5 7:

    1. SizedConstPtr is not compatible with SizedPtr because there is no rule in the C standard that makes two structure types completed in the same translation unit compatible. Each complete definition of a structure makes a new type. Even typedef struct { float x, y; } Point; and typedef struct { float x, y; } Complex; are different types even though their definitions are identical.

    2. SizedConstPtr is not a qualified version of SizedPtr or any type compatible with it because it is not qualified at all. (It contains members with some qualifiers, but it itself is not qualified.)

    3. SizedConstPtr is not a signed or unsigned type.

    4. SizedConstPtr is not a signed or unsigned type.

    5. SizedConstPtr is an aggregate type but does not contain SizedPtr among its members.

    6. SizedConstPtr is not a character type.
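The incompatibility in item 1 can be observed at compile time. The following is a minimal sketch, assuming a C11 compiler: a _Generic selection may not list two compatible types among its associations, so the fact that it compiles with both structures listed indicates the compiler treats them as distinct types. TYPE_NAME and check_distinct_types are hypothetical names introduced for illustration; they are not part of the original code.

```c
#include <assert.h>
#include <string.h>

typedef struct { unsigned long Size;       unsigned char *Data; } SizedPtr;
typedef struct { unsigned long Size; const unsigned char *Data; } SizedConstPtr;

/* Hypothetical helper macro: maps an expression to the name of its type.
   Listing two compatible types here would be a constraint violation,
   so successful compilation shows the two structs are not compatible. */
#define TYPE_NAME(x) _Generic((x),   \
    SizedPtr:      "SizedPtr",       \
    SizedConstPtr: "SizedConstPtr")

/* Hypothetical self-check: each object selects its own association. */
void check_distinct_types(void)
{
    SizedPtr      writable = { 0, NULL };
    SizedConstPtr readonly = { 0, NULL };

    assert(strcmp(TYPE_NAME(writable), "SizedPtr") == 0);
    assert(strcmp(TYPE_NAME(readonly), "SizedConstPtr") == 0);
}
```

This only illustrates that the types are distinct; it says nothing about their layout in memory.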

    Fix

    Instead of data_copy ( DST, *(SizedConstPtr*) &SRC );, we can use this code (memcpy is declared in <string.h>):

    SizedConstPtr temporary;
    memcpy(&temporary, &SRC, sizeof temporary);
    data_copy(DST, temporary);
    

    The memcpy routine effectively accesses its source using character accesses, and this conforms to the aliasing rules due to the sixth bullet item, a character type.

    This code relies on copying the bytes of an object of one type into an object of another type, where they are reinterpreted in the new type. For this to work, we need to ensure the memory layouts of SizedConstPtr and SizedPtr are the same.

    The only difference between the two structures is that one has an unsigned char * where the other has a const unsigned char *. These two types have the same representation because C 2018 6.2.5 28 says:

    … Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements…

    So we know the members of the structures individually have the same representations in memory. That leaves open the possibility that the structures are laid out differently in memory, that is, that the compiler uses different amounts of padding in them. There is no reason for a compiler to do this, as the alignment requirements do not differ between the structures, but the C standard would permit it. That they have the same layout can be assured using static assertions:

    #include <stddef.h>
    _Static_assert(sizeof (SizedPtr) == sizeof (SizedConstPtr), "SizedPtr and SizedConstPtr must have the same size");
    _Static_assert(offsetof(SizedPtr, Data) == offsetof(SizedConstPtr, Data), "SizedPtr and SizedConstPtr must have the same layout");
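
Putting the pieces together, the fix might look like the sketch below. It is an illustration under stated assumptions, not the asker's actual code: as_const is a hypothetical helper, the body of data_copy is a stand-in (the original only declares it), and the src_bytes and dst_bytes buffers are invented so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { unsigned long Size;       unsigned char *Data; } SizedPtr;
typedef struct { unsigned long Size; const unsigned char *Data; } SizedConstPtr;

/* Refuse to compile if the two structures are not laid out identically. */
_Static_assert(sizeof (SizedPtr) == sizeof (SizedConstPtr),
               "SizedPtr and SizedConstPtr must have the same size");
_Static_assert(offsetof(SizedPtr, Data) == offsetof(SizedConstPtr, Data),
               "SizedPtr and SizedConstPtr must have the same layout");

/* Hypothetical helper: copies the bytes of *p into a SizedConstPtr.
   memcpy reads its source as characters, which the aliasing rules allow. */
SizedConstPtr as_const(const SizedPtr *p)
{
    SizedConstPtr result;
    memcpy(&result, p, sizeof result);
    return result;
}

/* Stand-in body (assumption): copies src.Size bytes into dst. */
void data_copy(SizedPtr dst, SizedConstPtr src)
{
    assert(dst.Size >= src.Size);   /* destination must be large enough */
    memcpy(dst.Data, src.Data, src.Size);
}

/* Invented buffers so that SRC and DST refer to real storage. */
unsigned char src_bytes[4] = { 1, 2, 3, 4 };
unsigned char dst_bytes[4];
SizedPtr SRC = { sizeof src_bytes, src_bytes };
SizedPtr DST = { sizeof dst_bytes, dst_bytes };

void do_copy(void)
{
    data_copy(DST, as_const(&SRC));
}
```

If relying on the layout is undesirable, a member-by-member initialization such as SizedConstPtr c = { SRC.Size, SRC.Data }; avoids the byte copy entirely, since unsigned char * converts implicitly to const unsigned char *.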