Search code examples
algorithmperformanceoptimization

Minimum number of inversions to make one array identical to another


I'm trying to create an efficient algorithm to solve a problem that provides 2 arrays, and asks what's the minimum number of inversions in array B to make it identical to identical to array A (inversions = only adjacent elements can be swapped). Both arrays are guaranteed to contain the same elements. Examples:

a) A = ['a', 'b', 'c', 'd', 'e'] and B = ['b', 'e', 'd', 'c', 'a']. The minimum number of inversions would be 7 by doing the following inversions on array B: (e d), (e c), (e a), (d c), (d a), (c a) and (b a).

b) A = ['a', 'e', 'c', 'd', 'b'] and B = ['a', 'c', 'e', 'b', 'd'] would need 2 inversions, because: (c e) and (b d)

Tried it with the most basic algorithm I could think of, findind the index of element A[i] within B and placing it in the correct place but that's O(n^2). It becomes inefficient with larger inputs and ends up not solving the problem.

I was wondering what could be a better algorithm for this


Solution

  • Algorithm

    First, observe that we can count the number of adjacent swaps needed to sort the array by counting inversions: Pairs of indices (i, j) that are the wrong way around: i < j but a[j] > a[i].

    Why is this equivalent to your problem? Let's take two elements b, a that are in the wrong order:

    ..., b, a, ...

    now a swap will change this to

    ..., a, b, ...

    That is, this "removes" a single inversion. You can choose your swaps such that each swaps removes an inversion. So the number of swaps of adjacent elements needed is indeed equal to the number of inversions.

    You can count inversions in quadratic time using a naive algorithm, but that is of course too slow.

    A better approach is to solve this using a variation of Merge Sort in O(n log n) time:

    In addition to the sorted subarray, return an integer indicating the number of inversions in the subarray. Sum these counts for the recursive calls of mergesort with the inversions counted during merging:

    • When two elements from left and right are equal or the left element is smaller than the right one, everything is fine.
    • If however the right element is greater than the left one, we have an inversion of the right element with all the remaining elements in the left subarray.

    Now, with your current problem statement, there's a tiny twist: You have a "target" array, so the "order" your elements have to be in is not the "normal" order (say, ascending for integers), but rather the order set out in your "target" list. Luckily this is easy to deal with: Just redefine our comparison function! Simply build an index (f.e. hash table) from element to index in the array, and compare indices!

    Implementation

    Here's a Lua implementation of this, in case you get stuck with implementing any details, but I recommend you to try to work out the details yourself; it's a good exercise.

    (One thing to be careful with when reading the code: Unlike many other programming languages, indices in Lua start at 0; you do not need to pre-allocate arrays.)

    If you don't know Lua, just treat this as pseudocode. It should be quite readable given Lua's friendly syntax :)

    -- Merges two sorted lists; elements of `left` come before those of `right`
    local function merge(left, right, less_than)
        local result = {}
        local inversions = 0
        local i, j, k = 1, 1, 1
        while i <= #left and j <= #right do
            -- Compare "head" element, insert "winner"
            if less_than(right[j], left[i]) then
                -- Right list came first, this is an inversion with all the larger elements left in the left list
                inversions = inversions + (#left - i + 1)
                result[k] = right[j]
                j = j + 1
            else
                result[k] = left[i]
                i = i + 1
            end
            k = k + 1
        end
        -- Add remaining elements of either list or other_list
        for offset = 0, #left - i do
            result[k + offset] = left[i + offset]
        end
        for offset = 0, #right - j do
            result[k + offset] = right[j + offset]
        end
        return result, inversions
    end
    
    -- Splits a list into two halves of roughly equal size
    local function split(list)
        local left, right = {}, {}
        local mid = math.floor(#list / 2)
        for i = 1, mid do
            left[i] = list[i]
        end
        for i = mid + 1, #list do
            table.insert(right, list[i])
        end
        return left, right
    end
    
    -- Sorts a list, returning a sorted list and the number of inversions
    local function mergesort(list, less_than)
        if #list <= 1 then
            return list, 0 -- already sorted
        end
        local left, right = split(list)
        -- Sort left half & count inversions in left half
        local left_sorted, left_inversions = mergesort(left, less_than)
        -- Sort right half & count inversions in right half
        local right_sorted, right_inversions = mergesort(right, less_than)
        -- Merge halves, count inversions *between* left & right half
        local sorted, left_right_inversions = merge(left_sorted, right_sorted, less_than)
        return sorted, left_inversions + right_inversions + left_right_inversions
    end
    
    local function count_inversions(from_list, to_list)
        -- Build index from element to index in target list
        local idx = {}
        for i, v in ipairs(to_list) do
            idx[v] = i
        end
        -- Compare according to wanted indices in target list
        local function less_than(a, b)
            return idx[a] < idx[b]
        end
        local _, inversions = mergesort(from_list, less_than)
        return inversions
    end
    
    print(count_inversions({'a', 'b', 'c', 'd', 'e'}, {'b', 'e', 'd', 'c', 'a'})) -- 7
    print(count_inversions({'a', 'e', 'c', 'd', 'b'}, {'a', 'c', 'e', 'b', 'd'})) -- 2