Tags: c++, c++11, struct, unions, type-punning

Inconsistent results when type punning uint64_t with union and bit-field


I am using an anonymous struct in a union as follows:

using time64_t = uint64_t;
using bucket_t = uint64_t;

union clock_test {
    time64_t _time64;

    struct {
        bucket_t _bucket5 : 10;     // bucket:5  1024
        bucket_t _bucket4 : 8;      // bucket:4  256
        bucket_t _bucket3 : 6;      // bucket:3  64
        bucket_t _bucket2 : 6;      // bucket:2  64
        bucket_t _bucket1 : 6;      // bucket:1  64
        bucket_t _bucket0 : 6;      // bucket:0  64
    };
};

If bucket_t = uint64_t, it works as expected, but with bucket_t = uint16_t or uint32_t, I get puzzling results.

I use the same test code for all cases:

clock_test clk;
clk._time64 = 168839113046;

For bucket_t = uint64_t, clk is:

_bucket5   342  // unsigned __int64
_bucket4    26  // unsigned __int64
_bucket3    38  // unsigned __int64
_bucket2    15  // unsigned __int64
_bucket1    29  // unsigned __int64
_bucket0     2  // unsigned __int64

For bucket_t = uint32_t, clk is:

_bucket5   342  // unsigned int
_bucket4    26  // unsigned int
_bucket3    38  // unsigned int
_bucket2    15  // unsigned int
_bucket1    39  // unsigned int
_bucket0     0  // unsigned int

For bucket_t = uint16_t, clk is:

_bucket5    342 // unsigned short
_bucket4    152 // unsigned short
_bucket3     15 // unsigned short
_bucket2     39 // unsigned short
_bucket1      0 // unsigned short
_bucket0      0 // unsigned short

...
I am using VS Code with clang.


Solution

  • You get inconsistent results because, with most compilers, a bit-field member is not allowed to straddle a boundary of its declared type. The type of the member therefore matters, because it determines where padding is inserted:

    // 168839113046 in binary
    // type punned with bucket_t = unsigned short (assuming 16-bit)
    00100111 01001111 10011000 01101001 01010110
      |    |   |    | |      |       |         |
      |    |   |    | |      |       01 01010110 // _bucket5 = 342
      |    |   |    | |      | ######            // padding to 16-bit bounds
      |    |   |    | 10011000                   // _bucket4 = 152
      |    |   001111                            // _bucket3 = 15
      |    | ##                                  // padding to 16-bit bounds
      100111                                     // _bucket2 = 39
    ...                                          // _bucket1 = 0
                                                 // _bucket0 = 0
    

    Whenever the compiler can't fit the next bit-field member into the current 16-bit unit, it inserts padding and places the member at the start of the next unit. This changes the values you read, because you're reading bits at different positions. With a 64-bit member type this doesn't happen here, because all six members (42 bits in total) fit into a single 64-bit unit.
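
    The allocation rule described above can be modelled in a few lines. This is a rough sketch under stated assumptions, not how any particular compiler is implemented: `read_fields` is a hypothetical helper, it assumes fields are allocated from the least significant bit upward (as on typical little-endian targets), and it simply reports 0 for any field that would land beyond bit 63.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the allocation rule described above; not a compiler API.
// A field is placed at the next free bit unless it would straddle a boundary of
// its declared type (unit_bits wide), in which case padding is inserted first.
std::vector<std::uint64_t> read_fields(std::uint64_t value, unsigned unit_bits,
                                       const std::vector<unsigned>& widths) {
    assert(unit_bits > 0 && unit_bits <= 64);
    std::vector<std::uint64_t> fields;
    unsigned offset = 0;  // position of the next free bit, counted from the LSB
    for (unsigned width : widths) {
        assert(width > 0 && width <= unit_bits && width < 64);
        if (offset % unit_bits + width > unit_bits) {
            offset += unit_bits - offset % unit_bits;  // pad up to the next unit
        }
        const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
        // Simplification: fields placed beyond bit 63 are reported as 0.
        fields.push_back(offset < 64 ? (value >> offset) & mask : 0);
        offset += width;
    }
    return fields;
}
```

    With the widths from the question, `read_fields(168839113046, 16, {10, 8, 6, 6, 6, 6})` yields 342, 152, 15, 39, 0, 0; a unit size of 32 yields 342, 26, 38, 15, 39, 0; and 64 yields 342, 26, 38, 15, 29, 2, matching the three outputs in the question.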

    Non-Standard and Undefined Behavior

    That being said, your code is just not valid C++.

    1. anonymous structs are not standard C++; they only work because of a compiler extension (supported by GCC, Clang, and MSVC, among others)
    2. the layout and alignment of bit-field members is completely implementation-defined, so you might not get the same results with different compilers
    3. using union for type punning like this is undefined behavior; you can only access the active member of the union, with some exceptions

    To get consistent results, use shifts and masks:

    time64_t data = 168839113046;
    auto bucket5 = (data >>  0) & ((1u << 10) - 1);   // = 342
    auto bucket4 = (data >> 10) & ((1u <<  8) - 1);   // = 26
    auto bucket3 = (data >> 18) & ((1u <<  6) - 1);   // = 38
    // ...
    

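    This gives the same result on every compiler and platform, because it does not depend on how bit-fields are laid out. It works by shifting the value to the right so that the wanted field starts at bit 0, and then using the bitwise AND operator to keep only the lowest N bits.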
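
    For completeness, here is one way the shift-and-mask approach could be wrapped up for all six fields. It is only a sketch: `extract_bits` and the `bucketN` accessors are hypothetical names, and the shift amounts assume the contiguous layout the question gets with bucket_t = uint64_t (10, 8, 6, 6, 6, 6 bits starting at bit 0).

```cpp
#include <cassert>
#include <cstdint>

using time64_t = std::uint64_t;

// Hypothetical helper: returns `width` bits of `value`, starting at bit `shift`.
// The mask is built from a 64-bit one so that widths above 31 also work.
std::uint64_t extract_bits(time64_t value, unsigned shift, unsigned width) {
    assert(width > 0 && width < 64 && shift < 64);
    return (value >> shift) & ((std::uint64_t{1} << width) - 1);
}

// Hypothetical named accessors mirroring the six bit-fields from the question.
std::uint64_t bucket5(time64_t t) { return extract_bits(t,  0, 10); }
std::uint64_t bucket4(time64_t t) { return extract_bits(t, 10,  8); }
std::uint64_t bucket3(time64_t t) { return extract_bits(t, 18,  6); }
std::uint64_t bucket2(time64_t t) { return extract_bits(t, 24,  6); }
std::uint64_t bucket1(time64_t t) { return extract_bits(t, 30,  6); }
std::uint64_t bucket0(time64_t t) { return extract_bits(t, 36,  6); }
```

    For the value 168839113046 these return 342, 26, 38, 15, 29, and 2, the same numbers the question reports for bucket_t = uint64_t, without relying on a union or on bit-field layout.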