I'm learning the simple XOR encryption algorithm in C++.
The following code works fine:
    void test(int8_t* data, const int data_length) {
        const uint8_t key = 123;
        for (int index = 0; index < data_length; index++)
            data[index] = data[index] ^ key;
    }
The data I am given is signed, so it has type int8_t.
The problem is that the compiler shows the following warning:
"Use of a signed integer operand with a binary bitwise operator"
I can make the warning go away by casting data to uint8_t when performing the XOR operation, but I don't know the implications. I've done some tests and it doesn't seem to be a problem, but I am confused because data can contain signed values, so I am not sure whether the cast corrupts the data.
Is it correct to cast to uint8_t even if data can contain negative values? Or should I ignore the warning?
The compiler gives this warning because bitwise operations on signed integers are discouraged. Before C++20, the standard allowed several representations of signed integers, so the same number could have different bit patterns on different machines and compilers, which made the result of bit manipulation on signed integers non-portable. Although the intN_t types were always required to use two's complement representation (and C++20 extended that requirement to all signed integers), using signed integers for bitwise operations is still not recommended.
In your particular case, both data[index] and key are promoted to int before the XOR is performed. Since data[index] is signed, its value is sign-extended, while the unsigned key is zero-extended. The XOR therefore changes only the low 8 bits of the intermediate int value, and the upper bits keep the sign extension of data[index]. With key = 123 the top bit of the key is clear, so the result always fits in int8_t, but with a key of 128 or more the result can fall outside the int8_t range. Assigning such a value back to data[index] is a narrowing conversion whose result is implementation-defined before C++20 (since C++20 it is well defined: the upper bits are truncated).
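To make the promotion concrete, here is a small sketch that exposes the intermediate int value before it is narrowed back to int8_t. The helper name xor_promoted is made up for illustration, and the key 0x80 is an assumed example of a key with its top bit set; neither comes from the question.

```cpp
#include <cstdint>

// Hypothetical helper (not from the question): returns the intermediate
// int produced by `data ^ key`, before any narrowing back to int8_t.
constexpr int xor_promoted(std::int8_t data, std::uint8_t key) {
    // Both operands are promoted to int: data is sign-extended,
    // key is zero-extended. Only the low 8 bits can change.
    return data ^ key;
}

// With the question's key (123, top bit clear) the result stays in [-128, 127].
static_assert(xor_promoted(-1, 123) == -124, "fits in int8_t");
static_assert(xor_promoted(5, 123) == 126, "fits in int8_t");

// With an assumed key whose top bit is set, the result can leave that range.
static_assert(xor_promoted(0, 0x80) == 128, "above the int8_t maximum");
static_assert(xor_promoted(-1, 0x80) == -129, "below the int8_t minimum");
```

The static_assert lines are checked at compile time, so the file compiles only if the stated values are right.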
The correct thing to do in this case is to treat your data as an array of raw bytes, regardless of what values those bytes represent. This means you should use std::byte or std::uint8_t for the input and output data. That way you operate on unsigned integers and avoid both the portability and the potential overflow issues.
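As a rough sketch of that advice, the function below works on std::uint8_t throughout. The names xor_bytes and xor_signed_buffer are hypothetical, and the adapter assumes that std::int8_t and std::uint8_t are signed char and unsigned char (true on common platforms, but not guaranteed), so that viewing the buffer through a std::uint8_t pointer is permitted.

```cpp
#include <cstddef>
#include <cstdint>

// XOR every byte of the buffer with a single-byte key, in place.
void xor_bytes(std::uint8_t* data, std::size_t length, std::uint8_t key) {
    for (std::size_t i = 0; i < length; ++i) {
        // Both operands are promoted to int, but each is in [0, 255],
        // so the result is also in [0, 255] and the cast loses nothing.
        data[i] = static_cast<std::uint8_t>(data[i] ^ key);
    }
}

// Hypothetical adapter for callers that still hold an int8_t buffer.
// Assumption: uint8_t is unsigned char, which may alias any object.
void xor_signed_buffer(std::int8_t* data, std::size_t length, std::uint8_t key) {
    xor_bytes(reinterpret_cast<std::uint8_t*>(data), length, key);
}
```

Applying the same key twice restores the original bytes, so a buffer holding negative int8_t values such as -1 comes back unchanged after a round trip.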