c++ c++17 compiler-optimization bit-shift integer-promotion

C++ Bitshift in one line influenced by processor bit width (Bug or Feature?)


I ran into a strange problem. To make it clear, here is the code first:

#include <stdio.h>
#include <stdint.h>

int main() {
    uint8_t a = 0b1000'0000; // -> one leftmost bit
    uint8_t b = 0b1000'0000;
    
    a = (a << 1) >> 1; // -> Both shifts in one line
    
    b = b << 1; // -> Shifts separated into two individual lines
    b = b >> 1;
    
    printf("%i != %i", a, b);

    return 0;
}

(using C++17 on an x86 machine)

If you compile and run the code, b is 0 while a is 128. In general, these expressions should not depend on the processor's architecture or its bit width; I would expect both to be 0 after the operation.

The bitshift right operator is defined to fill up the left bits with zero, as the example with b proves.

If I look at the assembly code, I can see that for b, the value is loaded from RAM into a register, shifted left, written back into RAM, read again from RAM into a register and then shifted right. On every write back into RAM, the value is truncated to an 8-bit integer, which removes the leftmost 1 because it is out of range for an 8-bit integer.

For a, on the other hand, the value is loaded into a register (32 bits wide on the x86 architecture), shifted left, then shifted right again, which moves the 1 back to where it was because the 32-bit register had room to keep it.

My question is: is this one-line behavior for a correct, and something to take into account when writing code, or is it a compiler bug that should be reported to the compiler developers?


Solution

  • What you're seeing is the result of integer promotion. This means that (in most cases) anywhere an expression uses a type smaller than int, that type gets promoted to int.

    This is detailed in section 7.6p1 of the C++17 standard:

    A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (7.15) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.

    So in this expression:

    a = (a << 1) >> 1
    

    The value of a on the right side is promoted from the uint8_t value 0x80 to the int value 0x00000080. Shifting left by one gives you 0x00000100, then shifting right again gives you 0x00000080. That value is then truncated to the size of a uint8_t to give you 0x80 when it is assigned back to a.

    In this case:

    b = b << 1; 
    

    The same thing happens to start: 0x80 is promoted to 0x00000080 and the shift gives you 0x00000100. Then this value is truncated to 0x00 when it is assigned to b. The following statement, b = b >> 1, therefore shifts 0 and leaves b at 0.

    So this is not a bug, but expected behavior.
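
If the one-line form should behave like the two-statement form, one option is to narrow the intermediate result explicitly before the right shift. The following is a minimal sketch, not part of the original answer: the function names shift_promoted and shift_truncated are hypothetical, and it assumes int is wider than 8 bits (true on x86 and other mainstream platforms). The static_asserts check the result type of the shift and both outcomes at compile time.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper names, chosen for this illustration only.

// Both shifts in one expression: v is promoted to int once, and the
// intermediate value 0x100 is never narrowed, so the high bit survives.
constexpr std::uint8_t shift_promoted(std::uint8_t v) {
    return static_cast<std::uint8_t>((v << 1) >> 1);
}

// Narrow the intermediate result to 8 bits before shifting right.
// This mirrors what the assignment "b = b << 1;" does in the question.
constexpr std::uint8_t shift_truncated(std::uint8_t v) {
    return static_cast<std::uint8_t>(static_cast<std::uint8_t>(v << 1) >> 1);
}

// The left operand is promoted before the shift, so the result type is
// int rather than uint8_t (assuming int is wider than 8 bits).
static_assert(std::is_same<decltype(std::uint8_t{} << 1), int>::value,
              "uint8_t << int yields int after integral promotion");

// Same results as a and b in the question.
static_assert(shift_promoted(0x80) == 128, "high bit survives in the int");
static_assert(shift_truncated(0x80) == 0, "high bit is dropped by the cast");
```

With the inner static_cast in place, the single expression gives 0 for the input 0x80, the same as b in the question; without it, the result is 128, the same as a.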