Tags: c, variadic-functions, integer-promotion

Char and int16 array element both shown as 32bit hex?


In the example below:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    int16_t array1[] = {0xffff,0xffff,0xffff,0xffff};
    char array2[] = {0xff,0xff,0xff,0xff};
    /* sizeof yields a size_t, so print it with %zu rather than %d */
    printf("Char size: %zu \nint16_t size: %zu \n", sizeof(char), sizeof(int16_t));

    if (*array1 == *array2)
        printf("They are the same \n");
    if (array1[0] == array2[0])
        printf("They are the same \n");

    printf("%x \n", array1[0]);
    printf("%x \n", *array1);

    printf("%x \n", array2[0]);
    printf("%x \n", *array2);
    return 0;
}

Output:

Char size: 1 
int16_t size: 2 
They are the same 
They are the same 
ffffffff 
ffffffff 
ffffffff 
ffffffff

Why are 32-bit values printed for both char and int16_t, and why do the two compare equal?


Solution

  • They're the same because they're all different representations of -1.

    They print as 32 bits' worth of ff because int is 32 bits on your machine, you used %x, and the default argument promotions took place (basically, every integer argument smaller than int gets promoted to int, so each -1 arrives at printf as a 32-bit -1). Try using %hx. (That'll probably get you ffff. To get ff for the char, use %hhx (C99), use unsigned char, or mask with & 0xff: printf("%x \n", array2[0] & 0xff) .)


    Expanding on "They're the same because they're all different representations of -1":

    int16_t is a signed 16-bit type. It can contain values in the range -32768 to +32767.
    char is an 8-bit type, and on your machine it's evidently signed also. So it can contain values in the range -128 to +127.

    0xff is decimal 255, a value which can't be represented in a signed char. If you assign 0xff to a signed char, that bit pattern ends up getting interpreted not as 255, but rather as -1. (Similarly, if you assigned 0xfe, that would be interpreted not as 254, but rather as -2.)

    0xffff is decimal 65535, a value which can't be represented in an int16_t. If you assign 0xffff to an int16_t, that bit pattern ends up getting interpreted not as 65535, but rather as -1. (Similarly, if you assigned 0xfffe, that would be interpreted not as 65534, but rather as -2.)

    So when you said

    int16_t array1[] = {0xffff,0xffff,0xffff,0xffff};
    

    it was basically just as if you'd said

    int16_t array1[] = {-1,-1,-1,-1};
    

    And when you said

    char array2[] = {0xff,0xff,0xff,0xff};
    

    it was just as if you'd said

    char array2[] = {-1,-1,-1,-1};
    

    So that's why *array1 == *array2, and array1[0] == array2[0].


    Also, it's worth noting that all of this is very much because of the types of array1 and array2. If you instead said

    uint16_t array3[] = {0xffff,0xffff,0xffff,0xffff};
    unsigned char array4[] = {0xff,0xff,0xff,0xff};
    

    you would see different values printed (ffff and ff), and the values from array3 and array4 would not compare equal, because 0xffff stays 65535 and 0xff stays 255.

    Another answer stated that "there is no type information in C at runtime". That's true but misleading in this case. When the compiler generates code to manipulate values from array1, array2, array3, and array4, the code it generates (which of course is significant at runtime!) will be based on their types. In particular, when generating code to fetch values from array1 and array2 (but not array3 and array4), the compiler will use instructions which perform sign extension when assigning to objects of larger type (e.g. 32 bits). That's how 0xff and 0xffff got changed into 0xffffffff.