I was testing a simple compiler when I noticed that its output was completely wrong. In fact, the output had its endianness swapped from little to big. Upon closer examination, the offending code turned out to be this:
const char *bp = reinterpret_cast<const char*>(&command._instruction);
for (int i = 0; i < 4; ++i)
    out << bp[i];
A four-byte instruction is reinterpreted as a set of one-byte characters and printed to stdout (it's clunky, yes, but that decision was not mine). I don't see why the bytes would be swapped, since the char pointer should be pointing to the most significant byte (on this x86 system) first. For example, given 0x00...04, the char pointer should point to 0x00, not 0x04. Yet the latter is what happens.
I have created a simple demonstration of code:
#include <bitset>
#include <iostream>
#include <stdint.h>

int main()
{
    int32_t foo = 4;
    int8_t* cursor = reinterpret_cast<int8_t*>(&foo);

    std::cout << "Using a moving 8-bit pointer:" << std::endl;
    for (int i = 0; i < 4; ++i)
        std::cout << std::bitset<8>(cursor[i]) << " "; // <-- why?

    std::cout << std::endl << "Using original 4-byte int:" << std::endl;
    std::cout << std::bitset<32>(foo) << std::endl;

    return 0;
}
Output:
Using a moving 8-bit pointer:
00000100 00000000 00000000 00000000
Using original 4-byte int:
00000000000000000000000000000100
I don't see why the bytes would be swapped, since the char pointer should be pointing to the most significant byte (on this x86 system) first.
On an x86 system, a pointer to the base of a multi-byte object does not point at the most significant byte, but at the least significant byte. This is called "little-endian" byte order.
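One quick way to see which byte the base pointer refers to (a minimal sketch of my own, not code from the question) is to store a value whose bytes are all distinct and print the byte at the lowest address:

#include <cstdint>
#include <iostream>

int main()
{
    std::uint32_t value = 0x01020304;
    const unsigned char *p = reinterpret_cast<const unsigned char*>(&value);

    // p[0] is the byte at the object's lowest address.
    // On a little-endian machine (x86) this prints 4;
    // on a big-endian machine it would print 1.
    std::cout << static_cast<int>(p[0]) << std::endl;
    return 0;
}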
In C, if we take the address of an object that occupies multiple bytes and convert it to char *, the result points to the base of the object: the byte at the lowest address, from which the pointer can be positively displaced (with + or ++, etc.) to reach the other bytes.
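If the intent is to write the bytes most-significant-first regardless of the host's byte order, one alternative (a sketch under my own assumptions; instruction here is a hypothetical stand-in for command._instruction) is to extract each byte by value with shifts instead of walking a char pointer:

#include <cstdint>
#include <iostream>

int main()
{
    std::uint32_t instruction = 4;  // stand-in for command._instruction

    // Shift each byte down to the low position and mask it off,
    // starting with the most significant byte. The result is the
    // same on little-endian and big-endian hosts.
    for (int shift = 24; shift >= 0; shift -= 8)
        std::cout << static_cast<char>((instruction >> shift) & 0xFF);

    return 0;
}

This sidesteps the memory layout entirely, because the shifts operate on the integer's value rather than on its in-memory representation.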