I have the following code:
std::vector<short> vec{ 0, 2, 0, 4 };
int* lpvec = reinterpret_cast<int*>(&vec[0]);
(Compiled under VC12: short is 2 bytes, int is 4 bytes.) I expected it to produce:
lpvec[0] = 2,
lpvec[1] = 4
but, to my surprise, it outputs
lpvec[0] = 2 * 2^16 + 0 = 131072,
lpvec[1] = 4 * 2^16 + 0 = 262144
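(For reference, a minimal complete program along these lines reproduces what I'm seeing; the printing statements are just for illustration.)

#include <iostream>
#include <vector>

int main()
{
    std::vector<short> vec{ 0, 2, 0, 4 };
    int* lpvec = reinterpret_cast<int*>(&vec[0]);

    std::cout << lpvec[0] << '\n';   // prints 131072 here, not 2
    std::cout << lpvec[1] << '\n';   // prints 262144 here, not 4
}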
I say this is counter-intuitive because I think a vector of shorts is laid out in memory in the following pattern:
+---------+---------+---------+---------+
| 2 bytes | 2 bytes | 2 bytes | 2 bytes |
+---------+---------+---------+---------+
|    0    |    2    |    0    |    4    |
+---------+---------+---------+---------+
so the ints should look the same, but each occupies twice as much space:
+------------+------------+
|  4 bytes   |  4 bytes   |
+------------+------------+
| 0*2^16 + 2 | 0*2^16 + 4 |
+------------+------------+
Would anyone enlighten me as to why I am wrong?
Oh noes, that's not cool...
What you are doing is invoking undefined behavior.
Casting a short* to an int* violates the aliasing rules, but the "unexpected" result is primarily due to the implementation-defined endianness of integral values.
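If you actually need to read the bytes as an int, copying them with std::memcpy avoids the aliasing problem; the value you get still depends on endianness. A minimal sketch:

#include <cstring>    // std::memcpy
#include <iostream>
#include <vector>

int main()
{
    std::vector<short> vec{ 0, 2, 0, 4 };

    int first = 0;
    // Copying the bytes avoids the aliasing violation; the resulting
    // value still depends on the platform's endianness.
    std::memcpy(&first, vec.data(), sizeof first);

    std::cout << first << '\n';   // 131072 on a little-endian platform
}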
Little- vs Big-endian
"Endianness" is the order of the bytes (representing the value) stored inside an integral type, in this case an int
.
It seems that your platform uses little-endian, meaning that the least significant byte is stored first, while your expected result would require the implementation to use big-endian, which, as stated, isn't the case.
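If you want to check which byte order your implementation uses, you can inspect the first byte of an int known to hold 1; a small sketch:

#include <cstring>
#include <iostream>

int main()
{
    int probe = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);

    // On a little-endian platform the least significant byte is stored
    // first, so it holds the value 1.
    std::cout << (first_byte == 1 ? "little-endian" : "big-endian") << '\n';
}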
Your implementation stores short { 2 } as [0x02][0x00], which makes the first int pointed to by lpvec equivalent to the byte sequence [0x00][0x00][0x02][0x00].
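You can verify that layout by dumping the first four bytes of the vector's storage; inspecting memory through unsigned char* is one of the accesses the aliasing rules allow. A sketch:

#include <cstdio>
#include <vector>

int main()
{
    std::vector<short> vec{ 0, 2, 0, 4 };
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(vec.data());

    // Prints "00 00 02 00" on a little-endian platform: the bytes of
    // short { 0 } followed by the bytes of short { 2 }.
    for (int i = 0; i < 4; ++i)
        std::printf("%02x ", bytes[i]);
    std::printf("\n");
}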
The calculation involved, since your platform uses little-endian, would be:
(2^0 * 0) + (2^8 * 0) + (2^16 * 2) + (2^24 * 0) = 131072
Note: the above assumes that a byte is 8 bits wide, something which is also implementation-defined.
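Reassembling the value from those bytes by hand gives the same figure; a small sketch of the little-endian weighting (assuming 8-bit bytes):

#include <iostream>

int main()
{
    // Bytes of the first int, in memory order on a little-endian platform.
    unsigned char bytes[4] = { 0x00, 0x00, 0x02, 0x00 };

    // Least significant byte first: byte i contributes bytes[i] * 2^(8*i).
    long value = 0;
    for (int i = 0; i < 4; ++i)
        value += static_cast<long>(bytes[i]) << (8 * i);

    std::cout << value << '\n';   // 131072
}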