Search code examples
cbit-manipulationtwos-complementunsigned-integerinteger-promotion

What is the difference between literals and variables in C (signed vs unsigned short ints)?


I have seen the following code in the book Computer Systems: A Programmer's Perspective, 2/E. This works well and creates the desired output. The output can be explained by the difference of signed and unsigned representations.

#include<stdio.h>
int main() {
    if (-1 < 0u) {
        printf("-1 < 0u\n");
    }
    else {
        printf("-1 >= 0u\n");
    }
    return 0;
}

The code above yields -1 >= 0u, however, the following code which shall be the same as above, does not! In other words,

#include <stdio.h>

int main() {

    unsigned short u = 0u;
    short x = -1;
    if (x < u)
        printf("-1 < 0u\n");
    else
        printf("-1 >= 0u\n");
    return 0;
}

yields -1 < 0u. Why this happened? I cannot explain this.

Note that I have seen similar questions like this, but they do not help.

PS. As @Abhineet said, the dilemma can be solved by changing short to int. However, how can one explains this phenomena? In other words, -1 in 4 bytes is 0xff ff ff ff and in 2 bytes is 0xff ff. Given them as 2s-complement which are interpreted as unsigned, they have corresponding values of 4294967295 and 65535. They both are not less than 0 and I think in both cases, the output needs to be -1 >= 0u, i.e. x >= u.

A sample output for it on a little endian Intel system:

For short:

-1 < 0u
u =
 00 00
x =
 ff ff

For int:

-1 >= 0u
u =
 00 00 00 00
x =
 ff ff ff ff

Solution

  • The code above yields -1 >= 0u

    All integer literals (numeric constansts) have a type and therefore also a signedness. By default, they are of type int which is signed. When you append the u suffix, you turn the literal into unsigned int.

    For any C expression where you have one operand which is signed and one which is unsiged, the rule of balacing (formally: the usual arithmetic conversions) implicitly converts the signed type to unsigned.

    Conversion from signed to unsigned is well-defined (6.3.1.3):

    Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

    For example, for 32 bit integers on a standard two's complement system, the max value of an unsigned integer is 2^32 - 1 (4294967295, UINT_MAX in limits.h). One more than the maximum value is 2^32. And -1 + 2^32 = 4294967295, so the literal -1 is converted to an unsigned int with the value 4294967295. Which is larger than 0.


    When you switch types to short however, you end up with a small integer type. This is the difference between the two examples. Whenever a small integer type is part of an expression, the integer promotion rule implicitly converts it to a larger int (6.3.1.1):

    If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

    If short is smaller than int on the given platform (as is the case on 32 and 64 bit systems), any short or unsigned short will therefore always get converted to int, because they can fit inside one.

    So for the expression if (x < u), you actually end up with if((int)x < (int)u) which behaves as expected (-1 is lesser than 0).