
Weird behaviour in C when a hex value (over 0x7f) is assigned to a (signed) char


When a hex value "0x000000a1" is assigned to (signed) char, char is equal to "0xffffffa1". Can anyone explain this weird behaviour?

#include <stdio.h>

int main(void){
    char testVar = 0x000000a1;

    printf("%x \n",testVar); //prints ffffffa1
    printf("%d \n",testVar); //prints -95

}

It works as expected when you initialize an unsigned char:

#include <stdio.h>

int main(void){
    unsigned char testVar = 0x000000a1;

    printf("%x \n",testVar); //prints a1
    printf("%d \n",testVar); //prints 161

}

It also works as expected when you assign a value within the ASCII limit of 127 (or 0x7f):

#include <stdio.h>

int main(void){
    unsigned char testVar = 0x7f;

    printf("%x \n",testVar); //prints 7f
    printf("%d \n",testVar); //prints 127

}

What I understand now after asking the question:

  • You can't assign 0x000000a1 to a char, because a char is only 1 byte long. You only assign the a1 part.
  • The first bit in a signed value indicates whether it is negative or not (1000 0000 is negative and equals -128; 0111 1111 is not negative and equals 127).
  • For some reason printf needs to extend the char variable to int before it outputs the value.
  • When printf converts the signed char to int, it recognises that the char is negative and preserves the negative value by extending with 1's instead of 0's. (1000 0000b (-128 decimal) is extended to 1111 1111 ... 1111 1111 1000 0000b (still -128 decimal).)

Solution

  • You can't assign 0x000000a1 to a char, because a char is only 1 byte long. You only assign the a1 part.

    A char is by definition always 1 byte in size. (A byte could have more than 8 bits on exotic systems, though.) The key here is rather that char has implementation-defined signedness: it can be either signed or unsigned depending on the compiler (see "Is char signed or unsigned by default?"). Apparently it is signed in your case.

    The number of zeroes in front of the value doesn't matter in the slightest and does not affect the type. All they do is fool the programmer into thinking they have some meaning.

    The first bit in a signed value indicates whether it is negative or not (1000 0000 is negative and equals -128; 0111 1111 is not negative and equals 127).

    Yes, if it is the MSB.

    What happens when you assign 0xa1 to a signed char is also implementation-defined, because the value 161 does not fit in the range -128 to 127. On mainstream systems this will result in the negative two's complement value -95 decimal.

    For some reason printf needs to extend the char variable to int before it outputs the value.

    printf is a variadic function (variable number of arguments), and for those functions there is a special implicit type promotion rule called "the default argument promotions". When a small integer type such as char or short is passed to a variadic function, it always gets implicitly promoted to int. (And a float gets promoted to double.)

    When printf converts the signed char to int, it recognises that the char is negative and preserves the negative value by extending with 1's instead of 0's.

    Correct. During this promotion, if your original type was signed and has a negative value, this is respected and the sign is preserved; this is the so-called "sign extension". So in the binary representation, rather than having (signed) char 0xa1 = -95, you get int 0xffffffa1 = still -95.

    You then lie to printf with %x, telling it to expect an unsigned int parameter. Strictly speaking this is undefined behavior, but in practice any sensible compiler will treat the int as if it had been converted to unsigned int, which is a well-defined conversion. That leaves you with 0xffffffa1, but now in its unsigned representation, so it is equal to the decimal value 4294967201.