
Is C++ abstraction endian-neutral?

Suppose I have a client and a server that exchange 16-bit numbers over some network protocol, for example ModbusTCP; the specific protocol is not relevant here.

I know that the client (my PC) is little-endian and the server (some PLC) is big-endian. The client is written entirely in C++ with Boost Asio sockets. With this setup, I thought I had to swap the bytes received from the server to store the number correctly in a uint16_t variable. However, this turned out to be wrong: when I swap, I read incorrect values.

My understanding so far is that my C++ abstraction is storing the values into variables correctly without the need for me to actually care about swapping or endianness. Consider this snippet:

// received 0x0201  (513 in big endian)
uint8_t high { 0x02 };  // first byte
uint8_t low { 0x01 };   // second byte
// merge into 16 bit value (no swap)
uint16_t val = (static_cast<uint16_t>(high)<< 8) | (static_cast<uint16_t>(low));
std::cout<<val;   //correctly prints 513

This somewhat surprised me, because when I inspect the memory representation through a pointer, I find that the bytes are actually stored in little-endian order on the client:

// take the address of val, convert it to uint8_t pointer
auto addr = reinterpret_cast<uint8_t*>(&val);  // static_cast cannot convert uint16_t* to uint8_t*
// take the first and second bytes and print them 
printf ("%d ", (int)addr[0]);   // print 1
printf ("%d", (int)addr[1]);    // print 2

So the question is:

As long as I don't mess with memory addresses and pointers, C++ guarantees that the values I'm reading from the network are correct regardless of the server's endianness, correct? Or am I missing something here?

EDIT: Thanks for the answers, I want to add that I'm currently using boost::asio::write(socket, boost::asio::buffer(data)) to send data from the client to the server and data is a std::vector<uint8_t>. So my understanding is that as long as I fill data in network order I should not care about endianness of my system (or even of the server for 16 bit data), because I'm operating on the "values" and not reading bytes directly from memory, right?

To use the htons family of functions I would have to change my underlying TCP layer to use memcpy or similar and a uint8_t* data buffer, which is more C-esque than C++-ish. Why should I do that? Is there an advantage I'm not seeing?


  • (static_cast<uint16_t>(high)<< 8) | (static_cast<uint16_t>(low)) has the same behaviour regardless of endianness: the "left" end of a number is always the most significant bit, and endianness only determines whether that bit is stored in the first or the last byte in memory.

    For example:

    uint16_t input = 0x0201;
    uint8_t leftByte = input >> 8; // same result regardless of endianness
    uint8_t rightByte = input & 0xFF; // same result regardless of endianness
    uint8_t data[2];
    memcpy(data, &input, sizeof(input)); // data will be {0x02, 0x01} or {0x01, 0x02} depending on endianness

    The same applies in the other direction:

    uint8_t data[] = {0x02, 0x01};
    uint16_t output1;
    memcpy(&output1, data, sizeof(output1)); // will be 0x0102 or 0x0201 depending on endianness
    uint16_t output2 = data[0] << 8 | data[1]; // will be 0x0201 regardless of endianness
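
The contrast between decoding by value and decoding by memory copy can be sketched as a pair of small helpers. This is only an illustration: the names decode_be16 and decode_raw16 are hypothetical, and the wire format is assumed to be big-endian (network order), as in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: interpret two bytes as a big-endian (network order)
// 16-bit value. Shifts operate on values, not on memory layout, so the
// result does not depend on the host's byte order.
inline std::uint16_t decode_be16(const std::uint8_t* bytes) {
    assert(bytes != nullptr);
    return static_cast<std::uint16_t>((bytes[0] << 8) | bytes[1]);
}

// Hypothetical helper: copy the same two bytes straight into a uint16_t.
// This reinterprets memory, so the result depends on the host's byte order.
inline std::uint16_t decode_raw16(const std::uint8_t* bytes) {
    assert(bytes != nullptr);
    std::uint16_t value;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
}
```

For the bytes {0x02, 0x01}, decode_be16 yields 0x0201 on any host, while decode_raw16 yields 0x0201 on a big-endian host and 0x0102 on a little-endian one.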

    To ensure your code works on all platforms, it's best to use the htons and ntohs family of functions:

    uint16_t input = 0x0201; // input is in host order
    uint16_t networkInput = htons(input);
    uint8_t data[2];
    memcpy(data, &networkInput, sizeof(networkInput));
    // data is big endian or "network" order
    uint16_t networkOutput;
    memcpy(&networkOutput, data, sizeof(networkOutput));
    uint16_t output = ntohs(networkOutput);  // output is in host order
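
For the std::vector<uint8_t> buffer mentioned in the question's edit, a shift-based alternative that avoids memcpy might look like the sketch below. The helper names append_u16_be and read_u16_be are hypothetical, and the code assumes the protocol carries 16-bit values most significant byte first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 16-bit value to the buffer in network
// (big-endian) order, most significant byte first.
inline void append_u16_be(std::vector<std::uint8_t>& buf, std::uint16_t value) {
    buf.push_back(static_cast<std::uint8_t>(value >> 8));
    buf.push_back(static_cast<std::uint8_t>(value & 0xFF));
}

// Hypothetical helper: read a big-endian 16-bit value starting at `offset`.
inline std::uint16_t read_u16_be(const std::vector<std::uint8_t>& buf,
                                 std::size_t offset) {
    assert(offset + 2 <= buf.size());  // caller must supply two valid bytes
    return static_cast<std::uint16_t>((buf[offset] << 8) | buf[offset + 1]);
}
```

A buffer filled this way can be passed to boost::asio::buffer(data) as in the question. Because only shifts and masks are used, the bytes on the wire are the same on little-endian and big-endian hosts.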