Tags: c, memory, endianness, memcpy

Endian-independent way of using memcpy() from smaller to larger integer pointer


Suppose I have two arrays.

uint8_t src[SIZE] = { 0 };
uint32_t dst[SIZE] = { 0 };

uint8_t* srcPtr;  // Points to current src value
uint32_t* dstPtr; // Points to current dst value

src holds values that sometimes need to be copied into dst. Importantly, the values in src may be 8-bit, 16-bit, or 32-bit, and aren't necessarily properly aligned. Suppose I wish to use memcpy() as below to copy a 16-bit value:

memcpy(dstPtr, srcPtr, 2);

Will I run into an endianness issue here? This works fine on little-endian systems: if I want to copy the value 8, the bytes at srcPtr are 08 then 00, the bytes at dstPtr will be 08 00 00 00, and the value will be 8, as expected.

But if I were on a big-endian system, the bytes at srcPtr would be 00 then 08, and the bytes at dstPtr would be 00 08 00 00 (I presume), which would be read as 0x00080000, or 524288.

What would be an endian-independent way to write this copy?


Solution

    Will I run into an endianness issue here?

    Not necessarily endianness issues per se, but yes, the specific approach you describe will run into issues with integer representation.

    This works fine on little-endian systems: if I want to copy the value 8, the bytes at srcPtr are 08 then 00, the bytes at dstPtr will be 08 00 00 00, and the value will be 8, as expected.

    You seem to be making an assumption there, either

    • that more bytes of the destination will be modified than you actually copy, or perhaps
    • that relevant parts of the destination are pre-set to all zero bytes.

    But you need to understand that memcpy() will copy exactly the number of bytes requested. No more than that will be read from the specified source, and no more than that will be modified in the destination. In particular, the data types of the objects to which the source and destination pointers point have no effect on the operation of memcpy().
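
    To make that concrete, here is a minimal sketch showing that a 2-byte memcpy() into a 4-byte object leaves the other two bytes untouched. The function name copy_two_bytes_demo is hypothetical, and the destination is deliberately pre-filled with 0xFF (rather than zero) so that the unmodified bytes are visible:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical demo: copy a 2-byte value over the start of a 4-byte
 * destination that was pre-filled with 0xFF, then expose the
 * destination's raw bytes so the caller can inspect them. */
void copy_two_bytes_demo(const uint8_t src[2], uint8_t out_bytes[4])
{
    uint32_t dst = 0xFFFFFFFFu;           /* every byte starts as 0xFF */

    memcpy(&dst, src, 2);                 /* modifies only the first 2 bytes */
    memcpy(out_bytes, &dst, sizeof dst);  /* read back the object representation */
}
```

    Whatever the host's byte order, the first two bytes of the destination take the source bytes and the last two remain 0xFF. The copy in the question only appears to work on little-endian hosts because dst happened to be zero-initialized.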

    What would be an endian-independent way to write this copy?

    The most natural way to do it would be via simple assignment, relying on the compiler to perform the necessary conversion. With the pointer types shown, this handles the 8-bit case:

    *dstPtr = *srcPtr;
    

    However, I take your emphasis on the prospect that the arrays might not be aligned as a concern that it may be unsafe to dereference the source and/or destination pointer. That will not, in fact, be the case for pointers to character types (which is what uint8_t is on typical implementations), but it might be the case for pointers to other types. For cases where you take memcpy() as the only safe way to read from the arrays, the most portable method for converting value representations is still to rely on the implementation. For example:

    uint8_t* srcPtr = /* ... */;
    uint32_t* dstPtr = /* ... */;
    
    uint16_t srcVal;
    uint32_t dstVal;
    
    memcpy(&srcVal, srcPtr, sizeof(srcVal));
    dstVal = srcVal;  // conversion is automatically performed
    memcpy(dstPtr, &dstVal, sizeof(dstVal));
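
    The same pattern extends to the 8-, 16-, and 32-bit cases from the question. What follows is only a sketch, assuming the values in src are stored in the host's native byte order. The names load_widened and store_u32 are hypothetical and not part of any library:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a `width`-byte unsigned value (1, 2, or 4)
 * stored in native byte order at a possibly unaligned address, and
 * return it widened to 32 bits. Unsupported widths yield 0. */
uint32_t load_widened(const uint8_t *src, size_t width)
{
    switch (width) {
    case 1: {
        return *src;                     /* byte access is always aligned */
    }
    case 2: {
        uint16_t v;
        memcpy(&v, src, sizeof v);       /* safe for unaligned src */
        return v;                        /* value-preserving conversion */
    }
    case 4: {
        uint32_t v;
        memcpy(&v, src, sizeof v);
        return v;
    }
    default:
        return 0;
    }
}

/* Hypothetical helper: store a 32-bit value into a possibly unaligned slot. */
void store_u32(void *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);   /* always writes all 4 bytes */
}
```

    With these, the 16-bit copy from the question could be written as store_u32(dstPtr, load_widened(srcPtr, 2));. Because all four destination bytes are written, the result does not depend on what dst held beforehand.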