Python XOR behavior with a mix of positive/negative number

Here is two results I get when I xor 2 integers. The sames bits, but a different sign for the second parameter of the xor.

>>> bin(0b0001 ^ -0b0010)
'-0b1'
>>> bin(0b0001 ^  0b0010)
'0b11'

I don't really understand the logic. Isn't XOR just supposed so XOR every bit one by one ? Even with signed bits ? I would expect to get the same results (with a different sign).

Solution

If python's integers were fixed-width (eg: 32-bit, or 64-bit), a negative number would be represented in 2's complement form. That is, if you want -a, then take the bits of a, invert them all, and then add 1. Then a ^ b is just the number that's represented by the bitwise xor of the bits of a and b in two's complement. The result is re-interpreted in two's complement (ie: negative if the top bit is set).

Python's int type isn't fixed-width, but the result of a ^ b follows the same pattern: imagine that the values are represented as a wide-enough fixed-with int type, and then take the xor of the two values.

Although this now seems a bit arbitrary, it makes sense historically: Python adopted many operations from C, so xor was defined to work like in C. Python had a fixed-width integer type like C, and having a ^ b give the same result for the fixed-width and arbitary-width integer types essentially forces the current definition.

Back to a worked example: 1 ^ -2. 8 bits is more than enough to represent these two values. In 2's complement:

 1 = 00000001
-2 = 11111110

Then the bitwise xor is:

   = 11111111

This is the 8-bit 2's complement representation of -1. Although we've used 8 bits here, the result is the same no matter the width chosen as long as it's enough to represent the two values.