c, language-lawyer, strict-aliasing

Does this code violate the strict aliasing rule?


Questions:

  1. Does the code below violate strict aliasing rules? That is, would a smart compiler be allowed to print 00000 (or produce some other nasty effect), because a buffer first accessed as another type is then accessed via int*?

  2. If not, would moving just the definition and initialization of ptr2 before the braces (so ptr2 would already be defined when ptr1 comes into scope) break it?

  3. If not, would removing the braces (so ptr1 and ptr2 were in the same scope) break it?

  4. If yes, how could the code be fixed?

Bonus question: If the code is OK, and neither 2. nor 3. breaks it, how could it be changed so that it would violate strict aliasing rules (for example, by converting the braced loop to use int16_t)?


int i;
void *buf = calloc(5, sizeof(int)); // buf initialized to 0

{
    char *ptr1 = buf;    
    for(i = 0; i < 5*sizeof(int); ++i)
        ptr1[i] = i;
}

int *ptr2 = buf;
for(i = 0; i < 5; ++i)
    printf("%d", ptr2[i]);

I am looking for confirmation, so a short(ish) expert answer about this particular code, ideally with minimal standard quotes, is what I am after. I am not after long explanations of strict aliasing rules, only the parts that pertain to this code. It would also be great if an answer explicitly addressed each of the numbered questions above.

Also assume a general-purpose CPU with no integer trap values, and let's also say int is 32 bits and two's complement.


Solution

  • No, it doesn't, but only because the memory was dynamically allocated and written through a character type.

    The memory is allocated using calloc. Because it is allocated storage, the object has no declared type [1], and therefore it initially has no effective type.

    The code then modifies the object through lvalues of type char. Because the stores use a character type [2], and no value is being copied from an object that has an effective type [5], they do not establish a lasting effective type. The effective type is char only for the duration of each access [3]; afterwards the object again has no effective type.

    Then the object is accessed through type int, and only read. As the object has no effective type at that point, its effective type becomes int for the duration of each read [3], and afterwards it again has none. Since int is trivially compatible with the effective type int, the behavior is defined.

    (Assuming the values read are not trap representations for int.)
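The reasoning above can be tried out with a self-contained version of the question's snippet. This is only a sketch: the function name `fill_as_char_read_as_int` and the `COUNT` macro are hypothetical names introduced here, and, as in the question, int is assumed to have no trap representations.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define COUNT 5 /* hypothetical name for the element count used in the question */

/* Fills an allocated buffer byte by byte through char*, then reads it
   back through int*. Returns 0 on success, -1 if the allocation fails. */
int fill_as_char_read_as_int(int out[COUNT])
{
    void *buf = calloc(COUNT, sizeof(int)); /* allocated: no declared type */
    if (buf == NULL)
        return -1;

    {
        char *ptr1 = buf;
        for (size_t i = 0; i < COUNT * sizeof(int); ++i)
            ptr1[i] = (char)i; /* character-type store: no lasting effective type */
    }

    int *ptr2 = buf;
    for (size_t i = 0; i < COUNT; ++i)
        out[i] = ptr2[i]; /* effective type is int for the duration of this read */

    free(buf);
    return 0;
}
```

The values stored in `out` depend on the byte order of the platform, so a caller that prints them with `printf("%d", out[i])` should not expect the same digits everywhere.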


    Had you accessed and modified the object using a non-character type that is also not compatible with int, the behavior would be undefined.

    Let's say your example was (assuming sizeof(float)==sizeof(int)):

    int i;
    void *buf = calloc(5, sizeof(float)); // buf initialized to 0
    
    {
        float *ptr1 = buf;    
        for(i = 0; i < 5; ++i)
            ptr1[i] = (float)i;
    }
    
    int *ptr2 = buf;
    for(i = 0; i < 5; ++i)
        printf("%d", ptr2[i]);
    

    When the floats are written, the effective type of each object becomes float for that write and for all subsequent accesses that do not modify it [2]. When the objects are then read through int, the effective type remains float, because a read does not change it; it would change only on the next store to the object, which never happens here. The types int and float are not compatible [4], so the behavior is undefined.
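If the intent of the float variant is to inspect the stored bit patterns as integers, one commonly used approach is to copy the bytes into int objects with memcpy, so that no float object is ever read through an int lvalue. The following is a minimal sketch under the same assumptions as above (sizeof(float) == sizeof(int), no int trap representations); the helper name `float_bits_to_ints` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(int), "sketch assumes equal sizes");

/* Hypothetical helper: copies the object representation of each float
   into the corresponding int with memcpy. The floats are only read as
   bytes and the ints are only written as bytes, so no float is accessed
   through an lvalue of type int. */
void float_bits_to_ints(const float *src, int *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        memcpy(&dst[i], &src[i], sizeof dst[i]);
}
```

A caller would pass a destination array declared as int, for example `int bits[5];`, and can then print the elements with `%d` as in the original loop.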


    (All text below is quoted from ISO/IEC 9899:201x.)

    1 (6.5 Expressions 6)
    The effective type of an object for an access to its stored value is the declared type of the object, if any. (Footnote 87: Allocated objects have no declared type.)

    2 (6.5 Expressions 6)
    If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

    3 (6.5 Expressions 6)
    For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

    4 (6.5 Expressions 8)
    An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
      - a type compatible with the effective type of the object,
      - a qualified version of a type compatible with the effective type of the object,
      - a type that is the signed or unsigned type corresponding to the effective type of the object,
      - a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
      - an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
      - a character type.

    5 (6.5 Expressions 6)
    If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.
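As an illustration of the memcpy rule quoted last, the sketch below copies values from int objects into allocated storage and then reads them through int*. Under that rule the copied-to storage takes the effective type of the source (int), so the reads are compatible. The function name `copy_then_sum` is hypothetical and exists only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies n ints into freshly allocated storage and
   sums them through int*. Returns 0 on success, -1 if allocation fails
   (which may also happen for n == 0 on some implementations). */
int copy_then_sum(const int *src, size_t n, long *sum_out)
{
    int *copy = malloc(n * sizeof *copy); /* allocated: no declared type */
    if (copy == NULL)
        return -1;

    /* memcpy from int objects: the effective type of the destination
       becomes int for this and subsequent non-modifying accesses. */
    memcpy(copy, src, n * sizeof *copy);

    long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += copy[i]; /* read through int, matching the effective type */

    free(copy);
    *sum_out = sum;
    return 0;
}
```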