Tags: c, bit-manipulation, bitwise-operators, bit-shift

Left Bit shift and casting


I have a behaviour that I don't understand. I am trying to construct a 64-bit integer from an array of bytes, converting from big endian to little endian.

uint64_t u;
uint8_t bytes[2];


bytes[1] = 0xFF;
u =  bytes[1] << 24 ;
dump_bytes_as_hex( &u, 8 );

00 00 00 FF FF FF FF FF

u =  ( (uint16_t) bytes[1]) << 24 ;
dump_bytes_as_hex( &u, 8 );

00 00 00 FF FF FF FF FF

u =  ( (uint32_t) bytes[1]) << 24 ;
dump_bytes_as_hex( &u, 8 );

00 00 00 FF 00 00 00 00

I don't understand why it gives me the correct result only if I cast to a type that has more bits than the shift size. I have tried different values:

  • 0xFF - 1 gives the same bad result
  • 100 gives the correct result without casting

So I wanted to know: what is the rule? And why does 100 give me the correct value?

Thank you.

Edit :

Here is a reproducible example:

#include <stdio.h>
#include <stdint.h>


void dump_bytes_as_hex( uint8_t* b, int count )
{
    FILE* f;

    f = stdout;

    for( int c = 0; c < count; ++c )
    {
        fprintf( f, "%02X", b[c] );
        fputc( ' ', f );
    }
    fputc( '\n', f );
    fflush( f );
}

void test( uint8_t i )
{
    uint64_t u;
    uint8_t bytes[2];

    fprintf( stdout, "Test with %d\n", (int) i );

    u = 0;
    bytes[1] = i;

    u =  bytes[1] << 24 ;
    dump_bytes_as_hex( (uint8_t*) &u, 8 );

    u =  ( (uint16_t) bytes[1]) << 24 ;
    dump_bytes_as_hex( (uint8_t*) &u, 8 );

    u =  ( (uint32_t) bytes[1]) << 24 ;
    dump_bytes_as_hex( (uint8_t*) &u, 8 );

    fprintf( stdout, "\n\n");
}



int main()
{

    test( 0xFF );
    test( 0xFF -1  );
    test( 100 );

    return 0;

}

Solution

  • So I wanted to know: what is the rule?

    In fact, there is no rule for your specific examples ...

    bytes[1] = 0xFF;
    u =  bytes[1] << 24 ;
    

    ... and ...

    u =  ( (uint16_t) bytes[1]) << 24 ;
    dump_bytes_as_hex( &u, 8 );
    

    ... in the event that your int is 32 bits wide.

    The left operand of a shift operation is subject to the integer promotions, which have the effect of converting the 8- or 16-bit unsigned value of the left operand to (signed) int, and the result of the shift has that promoted type. If the arithmetic result of a left shift of a value of signed type (i.e. int) is not representable as a value of that type, then the behavior is undefined. That is the case here, which is why I said there is no rule: 0xFF << 24 is arithmetically 0xFF000000, which exceeds INT_MAX on a system with 32-bit int. (100 << 24, on the other hand, is 0x64000000, which does fit in a 32-bit int, so that shift is well-defined. That is why 100 gives you the correct value without a cast.)
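    The promotion described above can be observed directly. A minimal sketch, assuming a C11 compiler (it relies on _Generic): both a uint8_t and a uint16_t left operand are promoted to int before the shift happens, so the shift result has type int in both cases.

    ```c
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t b = 0xFF;

        /* A shift by 0 changes no bits, so _Generic reports the type the
           operand was promoted to: int in both cases, not uint8_t/uint16_t. */
        printf("%s\n", _Generic(b << 0,
                                int: "int",
                                unsigned int: "unsigned int",
                                default: "other"));
        printf("%s\n", _Generic((uint16_t) b << 0,
                                int: "int",
                                unsigned int: "unsigned int",
                                default: "other"));
        return 0;
    }
    ```

    This is why casting to uint16_t in the question changed nothing: the cast is immediately undone by the integer promotions.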

    In your particular case, it appears that your implementation performs the shift as if by reinterpreting the left operand as an unsigned 32-bit value, then reinterpreting the result as a (signed) int. The assignment to type uint64_t then proceeds (normally) by converting the negative value to type uint64_t by adding 2^64 to bring it into the range of that type. (Such conversions are performed only for conversion to unsigned integer types, not to signed ones.) You are working on a little-endian system and printing out the resulting bytes in memory order, so you get:

    00 00 00 FF FF FF FF FF

    On the other hand, if you convert the shift operand to uint32_t on your 32-bit system, ...

    u =  ( (uint32_t) bytes[1]) << 24 ;
    dump_bytes_as_hex( &u, 8 );
    

    ... then the left operand is unchanged by the integer promotions, and the result of the shift has that type. uint32_t can represent the arithmetic result of the shift (0xFF000000), so that is in fact the result. That value is unchanged when converted to type uint64_t for the assignment.
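    A small sketch of that well-defined path, assuming 32-bit int (the PRIX macros from inttypes.h are used only for portable printing): the shift happens in uint32_t, and the value survives the widening assignment intact.

    ```c
    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t b = 0xFF;

        /* The cast keeps the shift in unsigned arithmetic: no overflow, no UB. */
        uint32_t r = (uint32_t) b << 24;   /* 0xFF000000 */
        uint64_t u = r;                    /* value-preserving widening */

        printf("%08" PRIX32 " %016" PRIX64 "\n", r, u);
        return 0;
    }
    ```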


    General advice: use unsigned types when working with bitwise operations, and especially when performing shifts.
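    Applied to the original goal of building a 64-bit value from big-endian bytes, one common shape is the following sketch (load_be64 is a hypothetical helper name, not from the question): widen each byte to uint64_t before it is shifted, so no shift ever operates on a promoted (signed) int.

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Assemble a uint64_t from 8 bytes in big-endian order.
       Widening u before each shift keeps all arithmetic unsigned. */
    static uint64_t load_be64(const uint8_t b[8])
    {
        uint64_t u = 0;
        for (int i = 0; i < 8; ++i)
            u = (u << 8) | (uint64_t) b[i];
        return u;
    }

    int main(void)
    {
        const uint8_t bytes[8] = { 0x01, 0x23, 0x45, 0x67,
                                   0x89, 0xAB, 0xCD, 0xEF };
        printf("%016llX\n", (unsigned long long) load_be64(bytes));
        return 0;
    }
    ```

    This also sidesteps endianness questions entirely: it computes the value arithmetically instead of reinterpreting memory.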