How does C interpret data from a union if it's formatted differently?


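#include <stdio.h>

int main(void)
{
    /* Both members share the same storage. */
    union {
        char i[2];
        struct {
            short age;
        } myStruct;
    } myUnion;

    myUnion.i[0] = 'A';
    myUnion.i[1] = 'B';
    /* Read the same bytes back through the other member. */
    printf("%x ", myUnion.myStruct.age);

    return 0;
}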

So I understand that a union only reserves enough space for its largest member. In this case the char array "i" and the struct "myStruct" seem to be the same size, so the union would hold just two bytes, containing the characters 'A' and 'B'. What happens if you then read the struct member "age"?


Solution

  • It used to be, in days past, that this was "undefined behavior" and could theoretically crash your system or worse. However, programmers did it anyway, and it was codified in C99 (see Is type-punning through a union unspecified in C99, and has it become specified in C11?): the bytes that were stored are reinterpreted as the type of the member you read. The access itself is allowed, but the value you get depends on the implementation and may not make sense at all.

    So,

    • On modern 8-bit-byte 16-bit-short little-endian systems it will print 4241,

    • On modern 8-bit-byte 16-bit-short big-endian systems it will print 4142,

    • If sizeof(short) > 2 then you have a problem, because only the first two bytes of age were written and the remaining bytes were never initialized (but these systems are very rare),

    • You will get different results on EBCDIC, where 'A' and 'B' are encoded as 0xC1 and 0xC2 rather than 0x41 and 0x42 (which you don't use or care about),

    • You will get different results on non-8-bit-byte systems (which you don't use or care about),

    • You could invoke undefined behavior if your program creates a trap representation for a short... however, modern systems do not have trap representations for integers.