I am having trouble reconciling the two facts mentioned above.
If we look at the example of -128 here, the following steps are taken while encoding it
10000000
01111111
10000000
My question is: Where is the sign bit for a negative integer? In other words, I want to understand how 10000000 gets decoded to -128 and not -0.
I already saw this question, but mine's different because I get perfectly well that +128 can't be stored in 1 byte, because its signed binary would translate to 010000000, which would require 9 bits.
If we look at the example of -128 here, the following steps are taken while storing it…
Not when storing it, when calculating/converting it. When a compiler is processing -128 in source code or when we see it on paper and work with it, we do whatever computations we want. We can use bits or digits or marks on paper for whatever we want. When we produce the final answer, then the bits in that final answer have their final meanings. The intermediate steps do not have to use bits in the same way.
Given “128”, we calculate this is 10000000 in pure binary (no sign). Then we can calculate its two’s complement representation by complementing the bits to 01111111 and adding 1 (still in pure binary, no sign) to get 10000000. Then these same bits are the two’s complement representation.
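As a rough illustration of those steps (not how a compiler literally does it; this assumes an 8-bit unsigned char, as on essentially all current platforms, and the variable names are mine):

#include <stdio.h>

int main(void)
{
    unsigned char magnitude = 128;            /* 10000000, pure binary, no sign */
    unsigned char complemented = ~magnitude;  /* 01111111 */
    unsigned char result = complemented + 1;  /* 10000000, the two's complement bits */
    printf("%d %d %d\n", magnitude, complemented, result);  /* prints 128 127 128 */
    return 0;
}

The printed values are the pure-binary readings of the three bit patterns; only the final pattern is then reinterpreted as a two's complement representation.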
When the byte is interpreted, including when it is used in arithmetic or converted from two’s complement representation to decimal, the high bit will be interpreted as a sign bit. But, again, we do not need to use bits in the same way throughout the computation. We can take 10000000, see the high bit is set to tell us the number is negative, and then take its two’s complement as before: Complement the bits to 01111111 and then add one to make 10000000. Now we have the same bits, but they are a pure binary number, with no sign. They represent 128, and we know it is negative because we observed the original sign bit earlier.
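A sketch of that decoding, again using the bits as plain unsigned values in the middle (hypothetical code, assuming two's complement and 8-bit bytes):

#include <stdio.h>

int main(void)
{
    unsigned char bits = 0x80;    /* 10000000, the stored byte */
    if (bits & 0x80)              /* high bit set, so the value is negative */
    {
        /* Complement and add one, working in pure binary: 01111111 + 1 = 10000000 = 128. */
        unsigned char magnitude = (unsigned char) (~bits + 1);
        printf("%d\n", -(int) magnitude);   /* prints -128 */
    }
    return 0;
}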
Also note that signed char x and unsigned char y use the same bit patterns to represent different values. When x has the bit pattern 11111111, it represents −1. When y has the bit pattern 11111111, it represents 255. To make this work, the compiler will use different instructions for operations with x than for operations with y. There are different instructions for working with signed types than for working with unsigned types. (Many of them overlap in large part; addition and subtraction are often performed with the same instructions, but the flag results are interpreted differently to detect overflow and other conditions.)
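For example (my own illustration; the 11111111 pattern for −1 assumes two's complement, which every modern implementation uses and which C23 requires):

#include <stdio.h>
#include <string.h>

int main(void)
{
    signed char x = -1;
    unsigned char y = 255;
    /* Same bit pattern, 11111111, read two different ways.
       Prints: x = -1, y = 255, same bits: 1 */
    printf("x = %d, y = %d, same bits: %d\n", x, y, memcmp(&x, &y, 1) == 0);
    return 0;
}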
Additionally, for this single-byte example, the compiler generally does not work with it as a char. In source text, 128 is an int constant. Internally, a compiler likely converts 128 to a 32-bit int and then negates it to make −128, with bits 11111111111111111111111110000000, and then, to store it in a signed char, it takes the low eight bits, 10000000. (This may vary depending on the compiler.)
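A sketch of that last conversion (the exact internal steps vary by compiler; this just shows the result of converting the int −128 to signed char, assuming 32-bit int and two's complement):

#include <stdio.h>

int main(void)
{
    int i = -128;                            /* in a 32-bit int: 11111111111111111111111110000000 */
    signed char c = (signed char) i;         /* keeps the low eight bits: 10000000 */
    unsigned char bits = (unsigned char) c;  /* reread that byte as pure binary */
    printf("c = %d, bits = 0x%X\n", c, (unsigned) bits);  /* prints c = -128, bits = 0x80 */
    return 0;
}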
Interestingly, this boundary issue does affect the type of -2147483648. Consider a C implementation that uses a 32-bit int and a 64-bit long. −2,147,483,648 is representable in a 32-bit int, but, in the C grammar, -2147483648 is not a constant but is a combination of - and 2147483648. And, since 2,147,483,648 is not representable in 32 bits, it is a long constant. So the type of -2147483648 is long. You can verify this with:
printf("%zu %zu\n", sizeof -2147483647, sizeof -2147483648);
It will print "4 8" in C implementations with a 32-bit int.
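In C11 or later, _Generic can report the type directly; this sketch assumes the same 32-bit int and 64-bit long:

#include <stdio.h>

int main(void)
{
    printf("%s %s\n",
        _Generic(-2147483647, int: "int", long: "long", default: "other"),
        _Generic(-2147483648, int: "int", long: "long", default: "other"));
    /* Prints "int long" with 32-bit int and 64-bit long. */
    return 0;
}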
(Which raises the issue of how INT_MIN is defined. It must have the value −2,147,483,648, but it must have type int.)
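One common resolution, seen in many <limits.h> headers, is to define it as an expression that stays within int (shown here for illustration, not as any particular library's exact text):

#include <limits.h>
#include <stdio.h>

/* A typical shape for the definition:
       #define INT_MIN (-INT_MAX - 1)
   -INT_MAX is an int with value -2,147,483,647, and subtracting 1 stays within int,
   so the whole expression has type int and value -2,147,483,648. */

int main(void)
{
    printf("%d %zu\n", INT_MIN, sizeof INT_MIN);  /* prints -2147483648 4 with 32-bit int */
    return 0;
}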