Search code examples
pythonalgorithmsortingcomputer-science

How can I scramble an array in linear time without having duplicate elements next to one another?


For example:

T = [b, c, b, b, a, c] #each element represents questions related to topics a, b, c

What I want is make sure that no 2 questions from same topic are next to one another. (see the T where b, b are next to eachother)

So I want to rearrange the T in such a that no two questions belonging to same topic are next to each other, i.e. Tnew = [b, c, b, a, b, c]

But condition is that we have to do it linear time i.e. O(n) or Big-O of (n)

The algorithm that I thought of:

1)Create a dict or map to hold the occurrence of each topics:
a --> 1
b --> 3
c --> 2

2) Now based on the counts we can create new array such that:
A = [a, b, b, b, c, c]

3) Now perform unsorting of Array which I believe runs in O(n).
(unsorting is basically find midpoint and then merge the elements alternately from each half.

Can someone please help me design a pseudocode or algorithm that can do this better on any inputs with k number of topics?

This is a random question that I am practing for exam.


Solution

  • There is a linearithmic approach of time complexity O(log c * n) where c is the number of unique items and n is the total number of items.

    It works as follows:

    1. Create a frequency table of the items. This is just a list of tuples that tells how many of each item are available. Let's store the tuples in the list as (quantity, item). This Step is O(n). You could use collections.Counter in python, or collections.defaultdict(int) or a vanilla dict.
    2. Heapify the list of tuples in a max heap. This can be done in O(n). This heap has the items with the largest number of quantity at the front. You could use heapq module in python. Let's call this heap hp.
    3. Have a list for the results called res.
    4. Now run a loop while len(hp) > 0: and do as follows:
    • pop the 2 largest elements from the heap. O(log c) operation.
    • add one from each to res. Make sure you handle edge cases properly, if any.
    • decrement the quantity of both items. If their quantity > 0 push them back on the heap. O(log c) operation.

    At the end, you could end with one item that has no peers for inter weaving them. This can happen if the quantity of one item is larger than the sum of the quantities of all the other items. But there's no way around this. Your input data must respect this condition.

    One final note about time complexity: If the number of unique items is constant, we could drop the log c factor from the time complexity and consider it linear. This is mainly a case of how we define things.