Search code examples
dynamic-programming

Can dynamic programming help solve this problem?


Note - If you have other approach that works better than DP then please suggest them.

I have a 2D list like this [ [1,2], [1,2], [1,3], [5,4], [4,6] ]. I want to flatten this to a 1D list such that the no. of distinct values in the resulting list should be minimum like this [1,1,1,4,4]

Also, The length of subarray can be different from each other. Another eg.
[ [1,2,3], [1,3], [1,8], [5,6], [6,5], [4] ] Should transform to either [1,1,1,5,5,4] or [1,1,1,6,6,4].

I feel like dp/recursion is the way to solve this but I'm having a hard time figuring it out. Can anyone guide me?

I tried a approach of maintaining a score of the values on how many times they repeat throughout the list then choosing the value with the maximum score in the list, but this approach didn't work when there was a score tie and lists was like below.
[[1, 2],[1, 2],[1, 2],[1, 3],[3, 4],[3, 4],[4, 5, 9]]
This should transform to [1,1,1,1,4,4,4] but my approach gave me [1,1,1,1,3,3,4]


Solution

  • There's no known "good" algorithm for this problem — this is the "Hitting Set Problem" which is NP complete.

    But for small examples, you can search by maintaining a hitting set of the unique elements chosen so far, and try all possibilities when the hitting set doesn't already hit the next set.

    Something like this:

    def _smallest(xs, i, used):
        if len(xs) == i:
            return set(used)
        if any(x in used for x in xs[i]):
            return _smallest(xs, i+1, used)
        return min((_smallest(xs, i+1, used|{x}) for x in xs[i]), key=len)
    
    def smallest(xs):
        return _smallest(xs, 0, set())
    
    cases = [
        [[1,2], [1,2], [1,3], [5,4], [4,6]],
        [[1,2,3], [1,3], [1,8], [5,6], [6,5], [4]],
        [[1, 2], [1, 2], [1, 2], [1, 3], [3, 4], [3, 4], [4, 5, 9]],
    ]
    for c in cases:
        s = smallest(c)
        print(c, "->", [next(i for i in x if i in s) for x in c])
    

    Note: The runtime can be made more efficient by updating the used set in-place rather than constructing a new set each time, but the code is simpler and shorter as it appears above.

    Output:

    [[1, 2], [1, 2], [1, 3], [5, 4], [4, 6]] -> [1, 1, 1, 4, 4]
    [[1, 2, 3], [1, 3], [1, 8], [5, 6], [6, 5], [4]] -> [1, 1, 1, 5, 5, 4]
    [[1, 2], [1, 2], [1, 2], [1, 3], [3, 4], [3, 4], [4, 5, 9]] -> [1, 1, 1, 1, 4, 4, 4]