EDIT: The answer is that bitwise operations on signed values do weird things!
During a debugging process, I noticed a weird discrepancy, and I have translated it into an easy to read example below.
It would seem to me that var1 and var2 should be identical. After carefully stepping through this in a debugger, I found that var1 and var2 are identical for the first iteration but diverge in the second. I discovered this bug while trying to convert the expression for "var2" into assembly, and noticed that my logical translation (which I've displayed with "var1") was giving different results. The calculation for "var1" is, to me, an identical unpicking of the complex expression for "var2". Where am I going wrong?
This was compiled with Visual Studio Community 2019, x64, debug.
// x is an unsigned char, equivalent to the length of the string
// taking the null terminator into account
unsigned char var1 = x;
unsigned char var2 = x;
for (int i = 0; i < x; ++i) {
    unsigned char temp1 = string[i];
    unsigned char temp2 = var1 ^ temp1;
    unsigned char temp3 = table[temp2];
    var1 ^= temp3;
    var2 ^= table[var2 ^ string[i]];
}
In table[var2 ^ string[i]], var2 has an unsigned char value of 0 to 255, and string[i] may have a signed char value of −128 to +127. (We assume eight-bit bytes and two’s complement, which are ubiquitous in modern systems.)
As is usual with most C operators, the integer promotions are applied to the operands, which, in this case, produces int operands. For the unsigned char values 0 to 255, this produces an int with bits set only in the low eight bits. For char values −128 to −1, this produces an int with bits set throughout the int, particularly in the high bits.
Then the result of the XOR operation is an int with high bits set, including the sign bit, so it has a negative value. Then table is indexed with a negative subscript, going outside the bounds of the array. And so the behavior is not defined by the C standard.
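The difference between the two code paths can be seen in isolation with a small sketch. The function names here are hypothetical and not from the question, and signed char is spelled out so the result does not depend on whether plain char is signed on a given compiler. Two's complement is assumed, as above.

```c
/* Hypothetical helper: XOR with a signed operand, as in the var2 expression.
   Both operands are promoted to int, so a negative c is sign-extended. */
int xor_index_signed(unsigned char acc, signed char c)
{
    return acc ^ c;
}

/* Hypothetical helper: the same XOR after the character has been stored
   in an unsigned char, as temp1 does on the var1 path. */
int xor_index_unsigned(unsigned char acc, signed char c)
{
    unsigned char u = (unsigned char)c;  /* value is now in 0..255 */
    return acc ^ u;
}
```

With acc = 5 and c = −23 (the byte 0xE9), the first function returns −20, which would be a negative subscript, while the second returns 236, which stays inside a 256-element table.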
To remedy this, change the element type of string to unsigned char or convert string[i] to unsigned char before using it in bitwise operations.
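As a rough sketch of the second option, the loop from the question could be wrapped as follows. The function name mix and its parameter list are assumptions made for illustration; the question does not show how string, x, or table are declared, so a 256-entry table of unsigned char is assumed here.

```c
#include <stddef.h>

/* Hypothetical wrapper around the question's loop. The table contents are
   supplied by the caller and assumed to have 256 entries. */
unsigned char mix(const char *string, unsigned char x,
                  const unsigned char table[256])
{
    unsigned char var2 = x;
    for (size_t i = 0; i < x; ++i) {
        /* Converting to unsigned char before the XOR keeps the promoted
           operand in 0..255, so the subscript is never negative. */
        var2 ^= table[var2 ^ (unsigned char)string[i]];
    }
    return var2;
}
```

With the cast in place, the one-line expression performs the same conversions as the step-by-step var1 version, so the two should no longer diverge.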