c, sign-extension

Casting signed to unsigned and vice versa while widening the byte count


uint32_t a = -1;            // 11111111111111111111111111111111
int64_t b = (int64_t) a;    // 0000000000000000000000000000000011111111111111111111111111111111

int32_t c = -1;             // 11111111111111111111111111111111
int64_t d = (int64_t) c;    // 1111111111111111111111111111111111111111111111111111111111111111

From the observations above, it appears that only the signedness of the original value matters. That is, if the original 32-bit number is unsigned, casting it to a 64-bit value adds 0s to its left, regardless of whether the destination type is signed or unsigned; and

if the original 32-bit number is signed and negative, casting it to a 64-bit value adds 1s to its left, regardless of whether the destination type is signed or unsigned.

Is the above statement correct?


Solution

  • Correct, it's the source operand that dictates this.

    uint32_t a = -1;
    int64_t b = (int64_t) a;
    

    No sign extension happens here because the source value has the unsigned type uint32_t. The basic idea of sign extension is to ensure the wider variable has the same value (including sign). A value of an unsigned integer type is never negative, so the upper bits are filled with zeros. This is covered by paragraph /1 of the standard, quoted below.

    Negative sign extension (in the sense that the top 1-bit in a two's complement value is copied to all the higher bits in the wider type(a)) only happens when a signed type is extended in width, since only signed types can be negative.
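As a minimal sketch of the two widening cases described above, the following C code checks both the resulting values and the upper 32 bits of the widened result. The function names (`widen_from_unsigned`, `widen_from_signed`, `upper_half`, `check_widening`) are illustrative only and do not come from the question or the standard.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned source: the value is never negative, so it is preserved
 * by filling the upper 32 bits with zeros. */
int64_t widen_from_unsigned(uint32_t v) { return (int64_t)v; }

/* Signed source: the value (including its sign) is preserved, so a
 * negative value ends up with the upper 32 bits all set to one. */
int64_t widen_from_signed(int32_t v) { return (int64_t)v; }

/* Returns the upper 32 bits of the 64-bit pattern. */
uint32_t upper_half(int64_t v) { return (uint32_t)((uint64_t)v >> 32); }

/* Hypothetical self-check mirroring the code in the question. */
void check_widening(void) {
    uint32_t a = (uint32_t)-1; /* 0xFFFFFFFF */
    int32_t c = -1;

    assert(widen_from_unsigned(a) == INT64_C(4294967295));
    assert(upper_half(widen_from_unsigned(a)) == 0);

    assert(widen_from_signed(c) == -1);
    assert(upper_half(widen_from_signed(c)) == UINT32_C(0xFFFFFFFF));
}
```

Calling `check_widening()` from a `main` function should complete without tripping any assertion.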


    If the original 32 bit number is signed and negative, casting it to a 64 bit value will add 1's to its left regardless of the destination value being signed or unsigned.

    This is covered by paragraph /2 of the standard, quoted below. The sign of the value still has to be maintained when extending the bits, but converting a negative value to an unsigned type mathematically adds MAX_VAL + 1 (one more than the maximum value of the target type) to the value until it is within the range of the target type. In reality, for two's complement, no addition is done; the same bit pattern is simply interpreted in a different way.
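The "add MAX_VAL + 1" rule can be sketched for an int32_t to uint64_t conversion as follows. The helper names (`to_unsigned64`, `same_bits_as_signed64`, `check_unsigned_widening`) are made up for this illustration; the bit-pattern comparison relies on the exact-width types being two's complement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* int32_t -> uint64_t: a negative value has 2^64 added to it,
 * which is the same as sign-extending the bit pattern. */
uint64_t to_unsigned64(int32_t v) { return (uint64_t)v; }

/* Returns 1 if converting v to uint64_t and to int64_t yields the
 * same 64-bit pattern, i.e. only the interpretation differs. */
int same_bits_as_signed64(int32_t v) {
    uint64_t u = (uint64_t)v;
    int64_t s = (int64_t)v;
    return memcmp(&u, &s, sizeof u) == 0;
}

/* Hypothetical self-check for a few sample values. */
void check_unsigned_widening(void) {
    assert(to_unsigned64(-1) == UINT64_MAX);     /* -1 + 2^64 */
    assert(to_unsigned64(-2) == UINT64_MAX - 1); /* -2 + 2^64 */
    assert(to_unsigned64(5) == 5);               /* in range, unchanged */
    assert(same_bits_as_signed64(-1));
}
```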


    Both these scenarios are covered in the standard, in this case C11 6.3.1.3 Signed and unsigned integers /1 and /2:

    1/ When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

    2/ Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

    3/ Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

    Note that your widening conversions are covered by the first two points above. I've included the third point for completeness, as it covers things like conversion from uint32_t to int32_t, or from unsigned int to long where they have the same width (they both have a minimum range, but there's no requirement that unsigned int be narrower than long).
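For the same-width case in the third point, one way to stay within the first point is to check the range before converting. This is only a sketch: `checked_u32_to_i32` is a hypothetical helper, not a standard function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Converts only when the value is representable in int32_t, so that
 * point /1 applies and the value is unchanged. Returns false instead
 * of performing the implementation-defined conversion of point /3. */
bool checked_u32_to_i32(uint32_t in, int32_t *out) {
    if (in > (uint32_t)INT32_MAX) {
        return false; /* not representable: point /3 would apply */
    }
    *out = (int32_t)in;
    return true;
}

/* Hypothetical self-check around the boundary value. */
void check_same_width(void) {
    int32_t r = 0;
    assert(checked_u32_to_i32(UINT32_C(2147483647), &r) && r == INT32_MAX);
    assert(!checked_u32_to_i32(UINT32_C(2147483648), &r));
}
```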


    (a) This may be different in ones' complement or sign-magnitude representations but, since those are being removed from the standard (C23 requires two's complement), nobody really cares that much.


    In any case, the exact-width signed types such as int32_t and int64_t are required to use two's complement, so you don't have to worry about this aspect for your example code.