Why does casting unsigned to signed directly in C give the correct result?


In C, signed integer and unsigned integer are stored differently in memory. C also converts between signed and unsigned integers implicitly when the types are known at compile time. However, consider the following snippet:

#include <stdio.h>

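int main(void) {
    unsigned int a = 5;
    signed int b = a;                   /* implicit conversion of the value */
    signed int c = *(unsigned int*)&a;  /* read as unsigned int, then converted */
    signed int d = *(signed int*)&a;    /* same bytes read through a signed int lvalue */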

    printf("%u\n", a);
    printf("%i\n", b);
    printf("%i\n", c);
    printf("%i\n", d);

    return 0;
}

I expected the output to be:

5
5                   // Implicit conversion occurs
5                   // Implicit conversion occurs, because it knows that *(unsigned int*)&a is an unsigned int
[some crazy number] // a is cast directly to signed int without conversion

However, in reality, it outputs:

5
5
5
5

Why?


Solution

  • Your claim that ...

    In C, signed integer and unsigned integer are stored differently in memory

    ... is largely wrong. The standard instead specifies:

    For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M <= N). If the sign bit is zero, it shall not affect the resulting value.

    (C2011 6.2.6.2/2; emphasis added)

    Thus, although the representations of a signed integer type and its corresponding unsigned integer type (which have the same size) must differ at least in that the former has a sign bit and the latter does not, most bits of the two representations correspond exactly. The standard requires it. As a result, small(ish) non-negative integers are represented identically in corresponding signed and unsigned integer types.

    Additionally, some of the comments raised the matter of the "strict aliasing rule", which is paragraph 6.5/7 of the standard. It forbids accessing an object of one type via an lvalue of a different type, as your code does, but it allows some notable exceptions. One of the exceptions is that you may access an object via an lvalue whose type is

    • a type that is the signed or unsigned type corresponding to the effective type of the object,

    That is in fact what your code does, so there is no strict-aliasing violation there.