c, long-integer, memcpy

Copy 6 byte array to long long integer variable


I have read a 6-byte unsigned char array from memory. The byte order is big-endian. Now I want to assign the value stored in the array to an integer variable. I assume this has to be a long long, since it must hold up to 6 bytes.

At the moment I am assigning it this way:

unsigned char aFoo[6];
long long nBar;
// read values to aFoo[]...
// aFoo[0]: 0x00
// aFoo[1]: 0x00
// aFoo[2]: 0x00
// aFoo[3]: 0x00
// aFoo[4]: 0x26
// aFoo[5]: 0x8e
nBar = ((long long)aFoo[0] << 40) + ((long long)aFoo[1] << 32) + ((long long)aFoo[2] << 24) + ((long long)aFoo[3] << 16) + ((long long)aFoo[4] << 8) + aFoo[5];

A memcpy approach would be neat, but when I do this

memcpy(&nBar, &aFoo, 6);

the 6 bytes are copied to the start of the long long, which leaves the padding zeros at the end. Is there a better way than my assignment with the shifts?


Solution

  • What you want to accomplish is called de-serialisation or de-marshalling.

    For values that wide, using a loop is a good idea, unless you really need maximum speed and your compiler does not vectorise loops:

    uint8_t array[6];
    ...
    uint64_t value = 0;
    
    uint8_t *p = array;
    for ( int i = (sizeof(array) - 1) * 8 ; i >= 0 ; i -= 8 )
        value |= (uint64_t)*p++ << i;
    

    // Optional: left-align the value in the 64-bit word
    // value <<= 64 - (sizeof(array) * 8);

    Note the use of stdint.h types; sizeof(uint8_t) cannot differ from 1. Only these types are guaranteed to have the expected bit widths. Also use unsigned integers when shifting values. Right-shifting a negative signed value is implementation-defined, while left-shifting one invokes undefined behaviour.
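
As a rough sketch, the loop above can be wrapped in a small reusable function. The name read_be and its len parameter are hypothetical and not part of the original answer, and the body uses an equivalent shift-and-append form instead of a decreasing shift count, so it also works for a length of 0:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode `len` big-endian bytes (len <= 8)
   into an unsigned 64-bit value. */
static uint64_t read_be(const uint8_t *bytes, size_t len)
{
    assert(len <= sizeof(uint64_t));

    uint64_t value = 0;
    for (size_t i = 0; i < len; i++) {
        /* Move the bytes read so far up by one byte and append the
           next one. The shift operates on uint64_t, so no signed
           shift is involved. */
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

With the bytes from the question (0x00 0x00 0x00 0x00 0x26 0x8e), read_be(aFoo, sizeof aFoo) yields 0x268e, which is 9870 in decimal.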

    If you need a signed value, just use

    int64_t final_value = (int64_t)value;
    

    after the shifting. This conversion is implementation-defined whenever the value exceeds INT64_MAX (which a right-aligned 6-byte value never does), but all modern implementations (and likely the older ones) just copy the bit pattern unchanged. A modern compiler will likely optimise the conversion away, so there is no penalty.
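
The cast above always produces a non-negative number for a right-aligned 6-byte value. If the 6 bytes themselves encode a two's-complement signed quantity (an assumption; the question does not say so), bit 47 is the sign bit and has to be extended. A minimal sketch, using the hypothetical helper name sign_extend_48:

```c
#include <stdint.h>

/* Hypothetical helper: interpret the low 48 bits of `value` as a
   two's-complement number and widen it to int64_t. */
static int64_t sign_extend_48(uint64_t value)
{
    const uint64_t sign_bit = UINT64_C(1) << 47;

    value &= (UINT64_C(1) << 48) - 1;   /* keep only the low 48 bits */

    /* Flipping the sign bit and then subtracting it maps
       [0, 2^47) to itself and [2^47, 2^48) to [-2^47, 0).
       Both operands fit in int64_t, so neither cast is out of range. */
    return (int64_t)(value ^ sign_bit) - (int64_t)sign_bit;
}
```

This form avoids both shifting a signed value and converting an out-of-range value, so it does not depend on implementation-defined behaviour.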

    The declarations can be moved, of course. I just put them before where they are used for completeness.