c, unsigned, signed

How does C handle sign extension?


I have a pointer to a buffer of bytes, from which I am copying every even-indexed byte into an int (because of the protocol used to store the data in the buffer, I know the odd cycles are for read). Now when I do this

signed int a;
...
//inside a loop
a = buffer[2*i]; //buffer is unsigned

it gives me an unsigned number. However, when I do this

a = (int8_t)buffer[2*i];

the number is presented in signed form. That is forcing me to rethink how sign extension works in C, especially in scenarios like the one above. My understanding was that, since I am declaring a as signed, the compiler would do the sign extension automatically. Can anybody take some time to explain why this is not the case? I just spent an hour in this trap and don't want to fall into it again.


Solution

  • buffer is an array of unsigned eight-bit integers (or acts as one). So the value of buffer[2*i] is in the range from 0 to 255 (inclusive), and all values in that range are representable as ints, so assigning

    a = buffer[2*i];
    

    preserves the value; the conversion to the wider type int is done by padding with zeros (zero extension).

    If you cast to int8_t before assigning,

    a = (int8_t)buffer[2*i];
    

    values in the buffer larger than 127 are converted in an implementation-defined way to type int8_t, most likely by just reinterpreting the bit pattern as a signed 8-bit integer, which results in a negative value from -128 to -1. These values are representable as ints, so the assignment preserves them; the value-preserving conversion to the wider type int is then done by sign extension.