Search code examples
c#algorithmperformancecombinationsbirthday-paradox

Speed up processing 32 bit numbers in combinations (k from n)


I have a list of 128 32 bit numbers, and I want to know, is there any combination of 12 numbers, so that all numbers XORed give the 32 bit number with all bits set to 1.

So I have started with naive approach and took combinations generator like that:

        private static IEnumerable<int[]> Combinations(int k, int n)
        {
            var state = new int[k];
            var stack = new Stack<int>();
            stack.Push(0);

            while (stack.Count > 0)
            {
                var index = stack.Count - 1;
                var value = stack.Pop();

                while (value < n)
                {
                    state[index++] = value++;
                    if (value < n)
                    {
                        stack.Push(value);
                    }

                    if (index == k)
                    {
                        yield return state;
                        break;
                    }
                }
            }
        }

and used it like that (data32 is an array of given 32bit numbers)

foreach (var probe in Combinations(12, 128))
{
   int p = 0;
   foreach (var index in probe)
   {
      p = p ^ data32[index];
   }
   if (p == -1)
   {
      //print out found combination
   }
}

Of course it takes forever to check all 23726045489546400 combinations... So my question(s) are - am I missing something in options how to speedup the check process? Even if I do the calculation of combinations in partitions (e.g. I could start like 8 threads each will check combination started with numbers 0..8), or speed up the XORing by storing the perviously calculated combination - it is still slow.

P.S. I'd like it to run in reasonable time - minutes, hours not years. Adding a list of numbers as was requested in one of the comments:

1571089837 2107702069 466053875 226802789 506212087 484103496 1826565655 944897655 1370004928 748118360 1000006005 952591039 2072497930 2115635395 966264796 1229014633 827262231 1276114545 1480412665 2041893083 512565106 1737382276 1045554806 172937528 1746275907 1376570954 1122801782 2013209036 1650561071 1595622894 425898265 770953281 422056706 477352958 1295095933 1783223223 842809023 1939751129 1444043041 1560819338 1810926532 353960897 1128003064 1933682525 1979092040 1987208467 1523445101 174223141 79066913 985640026 798869234 151300097 770795939 1489060367 823126463 1240588773 490645418 832012849 188524191 1034384571 1802169877 150139833 1762370591 1425112310 2121257460 205136626 706737928 265841960 517939268 2070634717 1703052170 1536225470 1511643524 1220003866 714424500 49991283 688093717 1815765740 41049469 529293552 1432086255 1001031015 1792304327 1533146564 399287468 1520421007 153855202 1969342940 742525121 1326187406 1268489176 729430821 1785462100 1180954683 422085275 1578687761 2096405952 1267903266 2105330329 471048135 764314242 459028205 1313062337 1995689086 1786352917 2072560816 282249055 1711434199 1463257872 1497178274 472287065 246628231 1928555152 1908869676 1629894534 885445498 1710706530 1250732374 107768432 524848610 2791827620 1607140095 1820646148 774737399 1808462165 194589252 1051374116 1802033814


Solution

  • I don't know C#, I did something in Python, maybe interesting anyway. Takes about 0.8 seconds to find a solution for your sample set:

    solution = {422056706, 2791827620, 506212087, 1571089837, 827262231, 1650561071, 1595622894, 512565106, 205136626, 944897655, 966264796, 477352958}
    len(solution) = 12
    solution.issubset(nums) = True
    hex(xor(solution)) = '0xffffffff'
    

    There are 128C12 combinations, that's 5.5 million times as many as the 232 possible XOR values. So I tried being optimistic and only tried a subset of the possible combinations. I split the 128 numbers into two blocks of 28 and 100 numbers and try combinations with six numbers from each of the two blocks. I put all possible XORs of the first block into a hash set A, then go through all XORs of the second block to find one whose bitwise inversion is in that set. Then I reconstruct the individual numbers.

    This way I cover (28C6)2 × (100C6)2 = 4.5e14 combinations, still over 100000 times as many as there are possible XOR values. So probably still a very good chance to find a valid combination.

    Code (Try it online!):

    from itertools import combinations
    from functools import reduce
    from operator import xor as xor_
    
    nums = list(map(int, '1571089837 2107702069 466053875 226802789 506212087 484103496 1826565655 944897655 1370004928 748118360 1000006005 952591039 2072497930 2115635395 966264796 1229014633 827262231 1276114545 1480412665 2041893083 512565106 1737382276 1045554806 172937528 1746275907 1376570954 1122801782 2013209036 1650561071 1595622894 425898265 770953281 422056706 477352958 1295095933 1783223223 842809023 1939751129 1444043041 1560819338 1810926532 353960897 1128003064 1933682525 1979092040 1987208467 1523445101 174223141 79066913 985640026 798869234 151300097 770795939 1489060367 823126463 1240588773 490645418 832012849 188524191 1034384571 1802169877 150139833 1762370591 1425112310 2121257460 205136626 706737928 265841960 517939268 2070634717 1703052170 1536225470 1511643524 1220003866 714424500 49991283 688093717 1815765740 41049469 529293552 1432086255 1001031015 1792304327 1533146564 399287468 1520421007 153855202 1969342940 742525121 1326187406 1268489176 729430821 1785462100 1180954683 422085275 1578687761 2096405952 1267903266 2105330329 471048135 764314242 459028205 1313062337 1995689086 1786352917 2072560816 282249055 1711434199 1463257872 1497178274 472287065 246628231 1928555152 1908869676 1629894534 885445498 1710706530 1250732374 107768432 524848610 2791827620 1607140095 1820646148 774737399 1808462165 194589252 1051374116 1802033814'.split()))
    
    def xor(vals):
        return reduce(xor_, vals)
    
    A = {xor(a)^0xffffffff: a
         for a in combinations(nums[:28], 6)}
    
    for b in combinations(nums[28:], 6):
        if a := A.get(xor(b)):
            break
    
    solution = {*a, *b}
    
    print(f'{solution = }')
    print(f'{len(solution) = }')
    print(f'{solution.issubset(nums) = }')
    print(f'{hex(xor(solution)) = }')