arrays algorithm bit-manipulation dynamic-programming xor

Maximize AND on a sequence of XORs

Problem

We are given 2 arrays a and b both of length n. We build a third array c by rearranging the values in b. The goal is to find the optimal c that maximizes

result = (a[0] ^ c[0]) & (a[1] ^ c[1]) & ... & (a[n - 1] ^ c[n - 1])

where ^ is XOR and & is AND.

Is it possible to do this efficiently? It's straightforward to iterate through all possible permutations of b, but this is infeasible for large n.

More details

The order of the values in a is fixed.
The order of the values in b may be rearranged to form c. That is, starting with b = [1, 2, 3], it may be that the maximum result is obtained when the values are rearranged to c = [2, 1, 3].
b may be rearranged in-place if needed (I do this in the sample code).
Since the optimal c is not necessarily unique, any optimal c may be returned.
Assume all values are 32-bit unsigned integers.
1 <= n <= 10,000.

Test cases

Input:
a = [3, 4, 5]
b = [6, 7, 8]
Output:
c = [8, 7, 6] (result = 3)

Input:
a = [1, 11, 7, 4, 10, 11]
b = [6, 20, 8, 9, 10, 7]
Output:
c = [8, 6, 10, 9, 7, 20] (result = 9)

Input:
a = [0, 1, 2, 4, 8, 16]
b = [512, 256, 128, 64, 32, 16]
Output:
c = [16, 32, 64, 128, 256, 512] (result = 0)

Sample code

And here is the C++ code I used for a naive solution that exhaustively tries all permutations of b (tested on Windows 10 with Visual Studio 2019):

#include <algorithm>  // next_permutation
#include <cstdint>  // uint32_t
#include <iostream>  // i/o
#include <vector>  // vector

using uint = std::uint32_t;
using uvec = std::vector<uint>;

uint andxor(const uvec& a, const uvec& c)
{
    // Start with all bits set
    uint result = -1;

    for (std::size_t i = 0; i < c.size(); ++i)
    {
        result &= a[i] ^ c[i];
    }
    return result;
}

uvec solvePermute(const uvec& a, uvec& b)
{
    // next_permutation expects a pre-sorted input
    std::sort(b.begin(), b.end());

    // Initialize the result with the first permutation of b
    uvec c = b;
    uint bestResult = andxor(a, c);

    // Try all permutations of b to maximize the result of andxor
    while (std::next_permutation(b.begin(), b.end()))
    {
        uint result = andxor(a, b);
        if (result > bestResult)
        {
            bestResult = result;
            c = b;
        }
    }

    return c;
}

int main()
{
    // First test case
    uvec a{ 3, 4, 5 };
    uvec b{ 6, 7, 8 };

    uvec c = solvePermute(a, b);
    uint bestResult = andxor(a, c);

    std::cout << "Maximum result is " << bestResult << " with c = ";
    for (uint ci : c)
    {
        std::cout << ci << " ";
    }

    return 0;
}

Solution

Dillon's answer already has the most important ideas. Using these ideas, we can solve the problem in linear time and space.

The key goal is to make highly significant bits of the result 1, regardless of whether we sacrifice lesser significant bits. If we focus on a single bit k, then we can do the following:

partition the numbers in a that have their k-th bit set to 1 (a1) and 0 (a0), respectively
partition the numbers in b that have their k-th bit set to 1 (b1) and 0 (b0), respectively
if the number of entries in a1 is equal to those in b0, we can set the k-th bit in the result to 1. In this case, we consider the partitioning successful.

The partitions have the following meaning: In our final c, we need to match the entries of a1 to the entries of b0 and the entries of a0 to b1. If we do this, the XOR operation will result in 1 for all entries and the AND operation will produce an overall 1.

No, how can we use this insight in an algorithm? I choose to represent the partitioning of a by indices (i.e., the partitioning is a set of sets of indices) and the partitioning of b by actual numbers. Initially, we start with a partitioning with only one set each (i.e., the partitioning for a has the set of all indices and the partitioning for b has b as an element). We start with the most significant bit and try to do the partitioning.

If we have a successful partitioning, we end up with two partitions for both a and b (one of them might be empty). Then we already know which numbers from b we can put at what indices. If we violate this result, we get a lesser end result.

If our partitioning is not successful, we simply ignore this step.

Now let's go on the the next bit. We might already have a partitioning that does not only have the initial sets, but something more fine-grained. We don't want to mix the partitions up. So we can partition the partitions with the same approach as before. If we are successful for all partitions, we use the sub-partitions from now on. If not, we use the original partitions.

If we do this for all bits, we end up with a mapping for the numbers in b and the indices they can be placed in to achieve the maximum final result. It might not be a unique mapping. If a partition contains more than one element, any mapping will produce the maximum. So we just need to choose one and have the result.

Here is one example from your question:

a = {    1,    11,     7,     4,    10,    11  }
  = { 0001b, 1011b, 0111b, 0100b, 1010b, 1011b }
b = {    6,     20,     8,     9,    10,     7  }
  = { 0110b, 10100b, 1000b, 1001b, 1010b, 0111b }

And here are the most important algorithm steps:

               index partitioning    |   b partitioning
 -----------+------------------------+-----------------------
initial     | { 0, 1, 2, 3, 4, 5 }   |  {6, 20, 8, 9, 10, 7 }
------------+------------------------+-----------------------
after bit 3 | { 1, 4, 5 }            |  { 6, 20, 7 }
            | { 0, 2, 3 }            |  { 8, 9, 10 }
------------+------------------------+-----------------------
after bit 0 | { 1, 5 }               |  { 6, 20 }
(final)     | { 4 }                  |  { 7 }
            | { 0, 2 }               |  { 8, 10 }
            | { 3 }                  |  { 9 }

So we have a non-unique case. Numbers 6 and 20 could go both at indices 1 and 5. But number 7 must definitely go at index 4. One solution would be:

c = { 8, 6, 10, 9, 7, 20 }

Checking:

a = { 0001b, 1011b, 0111b, 0100b, 1010b,  1011b }
XOR
c = { 1000b, 0110b, 1010b, 1001b, 0111b, 10100b }
-------------------------------------------------
    { 1001b, 1101b, 1101b, 1101b, 1101b, 11111b }

AND = 1001b = 9

And here is some example C++ code. Note that the code's focus is on understandability. There are a few things that can be implemented more efficiently.

#include <iostream>
#include <vector>
#include <cstdint>

struct Partition
{
    std::vector<size_t> indices;
    std::vector<uint32_t> bs;
};

struct Partitioning
{
    bool success;
    Partition p1;
    Partition p2;
};

Partitioning partition(const std::vector<uint32_t>& a, const std::vector<size_t>& indices, const std::vector<uint32_t>& b, size_t bit)
{
    uint32_t mask = 1 << bit;

    Partitioning result;

    // partition the indices of a
    for (size_t i : indices)
    {
        uint32_t n = a[i];
        if (n & mask)
            result.p1.indices.push_back(i);
        else
            result.p2.indices.push_back(i);
    }
    
    // partition b
    for (uint32_t n : b)
        if (n & mask)
            result.p2.bs.push_back(n);
        else
            result.p1.bs.push_back(n);

    // check if we are successful
    bool canMakeBit1 = result.p1.indices.size() == result.p1.bs.size();
    result.success = canMakeBit1;

    return result;
}

void findMax(const std::vector<uint32_t>& a, const std::vector<uint32_t>& b)
{
    std::vector<uint32_t> aIndices(a.size());
    for (size_t i = 0; i < a.size(); ++i)
        aIndices[i] = i;

    // current partitioning
    std::vector<std::vector<uint32_t>> partsIndices;
    partsIndices.push_back(aIndices);
    std::vector<std::vector<uint32_t>> partsBs;
    partsBs.push_back(b);

    // temporary partitionings
    std::vector<Partitioning> partitionings;

    // assume 32 bits
    size_t bit = 32;
    do
    {
        --bit;

        bool success = true;
        partitionings.clear();
        
        // try to partition all current partitions
        for (size_t i = 0; i < partsIndices.size(); ++i)
        {
            partitionings.push_back(partition(a, partsIndices[i], partsBs[i], bit));
            if (!partitionings.back().success)
            {
                success = false;
                break;
            }
        }

        // if all partitionings are successful
        if (success)
        {
            // replace the current partitioning with the new one
            partsIndices.clear();
            partsBs.clear();
            for (auto& p : partitionings)
            {
                if (p.p1.indices.size() > 0)
                {
                    partsIndices.push_back(p.p1.indices);
                    partsBs.push_back(p.p1.bs);
                }

                if (p.p2.indices.size() > 0)
                {
                    partsIndices.push_back(p.p2.indices);
                    partsBs.push_back(p.p2.bs);
                }
            }
        }
    } while (bit > 0);

    // Generate c
    std::vector<uint32_t> c(a.size());    
    for (size_t i = 0; i < partsIndices.size(); ++i)
    {
        const auto& indices = partsIndices[i];
        const auto& bs = partsBs[i];
        for (size_t j = 0; j < indices.size(); ++j)
        {
            c[indices[j]] = bs[j];
        }
    }

    // Print the result
    uint32_t result = 0xffffffff;
    for (size_t i = 0; i < a.size(); ++i)
    {
        std::cout << c[i] << " ";
        result = result & (a[i] ^ c[i]);
    }
    std::cout << std::endl << result << std::endl;
}

int main()
{
    std::vector<uint32_t> a = { 1, 11, 7, 4, 10, 11 };
    std::vector<uint32_t> b = { 6, 20, 8, 9, 10, 7 };
    
    findMax(a, b);

    return 0;
}