Minimum number of inversions to make one array identical to another

I'm trying to create an efficient algorithm to solve a problem that provides 2 arrays, and asks what's the minimum number of inversions in array B to make it identical to identical to array A (inversions = only adjacent elements can be swapped). Both arrays are guaranteed to contain the same elements. Examples:

a) A = ['a', 'b', 'c', 'd', 'e'] and B = ['b', 'e', 'd', 'c', 'a']. The minimum number of inversions would be 7 by doing the following inversions on array B: (e d), (e c), (e a), (d c), (d a), (c a) and (b a).

b) A = ['a', 'e', 'c', 'd', 'b'] and B = ['a', 'c', 'e', 'b', 'd'] would need 2 inversions, because: (c e) and (b d)

Tried it with the most basic algorithm I could think of, findind the index of element A[i] within B and placing it in the correct place but that's O(n^2). It becomes inefficient with larger inputs and ends up not solving the problem.

I was wondering what could be a better algorithm for this

Solution

Algorithm

First, observe that we can count the number of adjacent swaps needed to sort the array by counting inversions: Pairs of indices (i, j) that are the wrong way around: i < j but a[j] > a[i].

Why is this equivalent to your problem? Let's take two elements b, a that are in the wrong order:

..., b, a, ...

now a swap will change this to

..., a, b, ...

That is, this "removes" a single inversion. You can choose your swaps such that each swaps removes an inversion. So the number of swaps of adjacent elements needed is indeed equal to the number of inversions.

You can count inversions in quadratic time using a naive algorithm, but that is of course too slow.

A better approach is to solve this using a variation of Merge Sort in O(n log n) time:

In addition to the sorted subarray, return an integer indicating the number of inversions in the subarray. Sum these counts for the recursive calls of mergesort with the inversions counted during merging:

When two elements from left and right are equal or the left element is smaller than the right one, everything is fine.
If however the right element is greater than the left one, we have an inversion of the right element with all the remaining elements in the left subarray.

Now, with your current problem statement, there's a tiny twist: You have a "target" array, so the "order" your elements have to be in is not the "normal" order (say, ascending for integers), but rather the order set out in your "target" list. Luckily this is easy to deal with: Just redefine our comparison function! Simply build an index (f.e. hash table) from element to index in the array, and compare indices!

Implementation

Here's a Lua implementation of this, in case you get stuck with implementing any details, but I recommend you to try to work out the details yourself; it's a good exercise.

(One thing to be careful with when reading the code: Unlike many other programming languages, indices in Lua start at 0; you do not need to pre-allocate arrays.)

If you don't know Lua, just treat this as pseudocode. It should be quite readable given Lua's friendly syntax :)

-- Merges two sorted lists; elements of `left` come before those of `right`
local function merge(left, right, less_than)
    local result = {}
    local inversions = 0
    local i, j, k = 1, 1, 1
    while i <= #left and j <= #right do
        -- Compare "head" element, insert "winner"
        if less_than(right[j], left[i]) then
            -- Right list came first, this is an inversion with all the larger elements left in the left list
            inversions = inversions + (#left - i + 1)
            result[k] = right[j]
            j = j + 1
        else
            result[k] = left[i]
            i = i + 1
        end
        k = k + 1
    end
    -- Add remaining elements of either list or other_list
    for offset = 0, #left - i do
        result[k + offset] = left[i + offset]
    end
    for offset = 0, #right - j do
        result[k + offset] = right[j + offset]
    end
    return result, inversions
end

-- Splits a list into two halves of roughly equal size
local function split(list)
    local left, right = {}, {}
    local mid = math.floor(#list / 2)
    for i = 1, mid do
        left[i] = list[i]
    end
    for i = mid + 1, #list do
        table.insert(right, list[i])
    end
    return left, right
end

-- Sorts a list, returning a sorted list and the number of inversions
local function mergesort(list, less_than)
    if #list <= 1 then
        return list, 0 -- already sorted
    end
    local left, right = split(list)
    -- Sort left half & count inversions in left half
    local left_sorted, left_inversions = mergesort(left, less_than)
    -- Sort right half & count inversions in right half
    local right_sorted, right_inversions = mergesort(right, less_than)
    -- Merge halves, count inversions *between* left & right half
    local sorted, left_right_inversions = merge(left_sorted, right_sorted, less_than)
    return sorted, left_inversions + right_inversions + left_right_inversions
end

local function count_inversions(from_list, to_list)
    -- Build index from element to index in target list
    local idx = {}
    for i, v in ipairs(to_list) do
        idx[v] = i
    end
    -- Compare according to wanted indices in target list
    local function less_than(a, b)
        return idx[a] < idx[b]
    end
    local _, inversions = mergesort(from_list, less_than)
    return inversions
end

print(count_inversions({'a', 'b', 'c', 'd', 'e'}, {'b', 'e', 'd', 'c', 'a'})) -- 7
print(count_inversions({'a', 'e', 'c', 'd', 'b'}, {'a', 'c', 'e', 'b', 'd'})) -- 2