Search code examples
javaalgorithmbinarybitwise-operators

How does XOR really works, and what is the magic behind it?


The question is maybe misleading a little bit but I didn't know how to ask it another way. There is a problem in hackerrank that goes as follow:

Consider an array of integers, where all but one of the integers occur in pairs. In other words, every element in occurs exactly twice except for one unique element.

Given the array find and print the unique element. [2,3,1,3,2] -> result is 1

enter image description here

I solved the problem like this:

private static int lonelyInteger(int[] a) {
        if(a==null)
            return 0;

         if(a.length<2)
             return a.length;

        Set<Integer> set = new HashSet<>();

        for(int i : a){

            if(set.contains(i)){
                set.remove(i);
            }else{
                set.add(i);
            }            
        }

        return (Integer) set.toArray()[0];
    }

However it turned our that there is a neat solution to this problem which is:

private static int lonelyInteger(int[] a) {
         int b = a[0];

         for(int i = 1; i < a.length; i++){
             b ^= a[i];
         }
        return b;       
    }

The problem is that is I don't know WHY IT WORKS ?! I understand HOW it works but don't understand WHY IT WORKS ? To understand that I made a small program to output the results of each step:

public class BitwiseOperator {

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        int n = in.nextInt();
        int[] a = new int[n];
        int sum = 0;
        in.nextLine();
        String line = in.nextLine();
        String[] numbers = line.split(" ");
        for (int i = 0; i < numbers.length; i++) {
            a[i] = Integer.valueOf(numbers[i]);
        }

        for (int i = 0; i < a.length; i++) {
            binary(sum, a[i]);
            sum ^= a[i];
            binary(sum);
            System.out.println();
            System.out.println();
            System.out.println();

        }
        System.out.println(sum);
    }


    private static void binary(int sum) {
        System.out.println(String.format("%16s", Integer.toBinaryString(sum)).replace(' ', '0') + " ->" + sum);
    }

    private static void binary(int sum, int i) {
        System.out.println(String.format("%16s", Integer.toBinaryString(sum)).replace(' ', '0') + " ->" + sum);
        System.out.println("XOR");
        System.out.println(String.format("%16s", Integer.toBinaryString(i)).replace(' ', '0') + " ->" + i);
        System.out.println("--------");
    }


}

If you enter the following input:

5

2 3 2 1 3

The output is:

0000000000000000 ->0
XOR
0000000000000010 ->2
--------
0000000000000010 ->2



0000000000000010 ->2
XOR
0000000000000011 ->3
--------
0000000000000001 ->1



0000000000000001 ->1
XOR
0000000000000010 ->2
--------
0000000000000011 ->3



0000000000000011 ->3
XOR
0000000000000001 ->1
--------
0000000000000010 ->2



0000000000000010 ->2
XOR
0000000000000011 ->3
--------
0000000000000001 ->1



1

So the program actually works BUT I Really need to understand WHY ?


Solution

  • An accurate proof, IMHO, involves group theory (you can build abelian group based on xor):

    1. xor is a group operation
    2. 0 is a group unit item (A xor 0 == A)
    3. A is an inverse element (so every A is inverse to itself: A ^ A = 0).

    Of course we have to prove that (A xor B) xor C == A xor (B xor C) - since xor is a bitwise operation (ith bit doesn't depend on jth bit) all we should do is to prove the statement for an arbitrary bit with a help of truth table.

    Since A xor B == B xor A we have an abelian group and that's why can put items in any order:

      A XOR B XOR C XOR A XOR B ==
     (A XOR A) XOR (B XOR B) XOR C ==
      C
    

    In general case:

       A xor B xor ... xor Distinct_Item xor ... xor Z ==
      (A xor A) xor (B xor B) xor ... xor (Z xor Z) xor Distinct_Item ==
       0 xor 0 xor ... xor Distinct_Item ==
       Distinct_Item